Minhash Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Language
English (United States)
Last Update
2018-04-17
dita:id
B700-1022
lifecycle
previous
Product Category
Software
InputTable
Specifies the name of the input table.
OutputTable
Specifies the name of the output table.
IDColumn
Specifies the name of the input table column that contains the IDs to be clustered. Typically these values are customer identifiers.
ItemsColumn
Specifies the name of the input column that contains the values to use for hashing.
SeedTable
[Optional] Specifies the name of the table that contains the seeds to use for hashing. Typically, this table was created by an earlier Minhash call that specified its name in the SaveSeedTo argument.
SaveSeedTo
[Optional] Specifies the name of the table where seeds are to be saved.
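The reason to save seeds and reuse them can be sketched in a few lines of Python. This is a generic illustration, not the Aster implementation: the hash formula, the function names generate_seeds and minhash_values, and the sample items are all assumptions. It shows that hashing the same item set with the same saved seeds reproduces the same minimum hash values, which is what makes results from separate runs comparable.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1, used here as the hash modulus (assumption)


def generate_seeds(count, rng=None):
    """Draw `count` random seeds; the returned list plays the role of a saved seed table."""
    rng = rng or random.Random()
    return [rng.randrange(1, PRIME) for _ in range(count)]


def minhash_values(items, seeds):
    """One minimum per seed, using the hypothetical hash h(x) = (seed * x + 1) % PRIME."""
    return [min((seed * x + 1) % PRIME for x in items) for seed in seeds]


# First run: generate seeds and keep them (analogous to SaveSeedTo).
saved_seeds = generate_seeds(4)
first_run = minhash_values({3, 7, 11}, saved_seeds)

# Later run: reuse the saved seeds (analogous to SeedTable).
later_run = minhash_values({3, 7, 11}, list(saved_seeds))

print(first_run == later_run)
```

A run that generated fresh seeds instead would, in general, produce different hash values for the same items.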
HashNum
Specifies the number of hash functions to generate. The number_of_hash_functions determines the number and size of clusters generated.
KeyGroups
Specifies the number of key groups to generate. The number_of_keygroups must be a divisor of number_of_hash_functions. A large number_of_keygroups decreases the probability that multiple users are assigned to the same cluster identifier.
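The divisor requirement can be illustrated with a minimal Python sketch of generic minhash key grouping. It is not the Aster implementation. It assumes one reading of KeyGroups, in which each cluster key combines number_of_keygroups consecutive minimum hash values; the source does not spell out the internal grouping. The hash family, the function names, and the sample data are hypothetical. Under this reading, two ID rows share a key only if every value in the group matches, which is consistent with a larger number_of_keygroups making shared cluster identifiers less likely.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1, used here as the hash modulus (assumption)


def make_hash_params(hash_num, seed=0):
    """Draw (a, b) pairs for hash_num functions h(x) = (a * x + b) % PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(hash_num)]


def minhash_signature(items, params):
    """Return one minimum hash value per hash function over the item set."""
    return [min((a * x + b) % PRIME for x in items) for a, b in params]


def cluster_keys(signature, key_groups):
    """Combine every `key_groups` consecutive minima into one cluster key."""
    if len(signature) % key_groups != 0:
        raise ValueError("key_groups must be a divisor of the number of hash functions")
    return [
        tuple(signature[i:i + key_groups])
        for i in range(0, len(signature), key_groups)
    ]


params = make_hash_params(hash_num=6, seed=42)
sig_a = minhash_signature({1, 2, 3, 4}, params)
sig_b = minhash_signature({1, 2, 3, 4}, params)  # identical items, identical signature

keys = cluster_keys(sig_a, key_groups=3)  # 6 hash values -> 2 keys of 3 values each
print(len(keys), [len(k) for k in keys])
```

With 6 hash functions, a grouping value of 4 would leave a partial group, so the sketch rejects it.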
InputFormat
[Optional] Specifies the format of the values to be hashed (the values in items_column). Default: 'integer'.
MinClusterSize
[Optional] Specifies the minimum cluster size. Default: 3.
MaxClusterSize
[Optional] Specifies the maximum cluster size. Default: 5.
Delimiter
[Optional] Specifies the delimiter used between hashed values (typically customer identifiers) in the output. Default: Space character.
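How the cluster size bounds and the delimiter might shape the output can be sketched in Python as follows. This is an illustration under stated assumptions, not the documented behavior of the function: it assumes that clusters outside the MinClusterSize to MaxClusterSize range are simply dropped, and the function name format_clusters, the cluster keys, and the sample IDs are hypothetical. The defaults mirror the ones listed above (3, 5, and a space).

```python
from collections import defaultdict


def format_clusters(assignments, min_cluster_size=3, max_cluster_size=5, delimiter=" "):
    """Group IDs by cluster key, keep clusters within the size bounds, and join the IDs."""
    members = defaultdict(list)
    for cluster_key, item_id in assignments:
        members[cluster_key].append(str(item_id))
    return {
        key: delimiter.join(sorted(ids))
        for key, ids in members.items()
        if min_cluster_size <= len(ids) <= max_cluster_size
    }


assignments = [
    ("c1", 101), ("c1", 102), ("c1", 103),   # 3 members: within the bounds
    ("c2", 201), ("c2", 202),                # 2 members: below the minimum
    ("c3", 301), ("c3", 302), ("c3", 303),
    ("c3", 304), ("c3", 305), ("c3", 306),   # 6 members: above the maximum
]

result = format_clusters(assignments)
comma_result = format_clusters(assignments, delimiter=",")
print(result)
```

Only the three-member cluster survives with the default bounds; changing the delimiter affects only how its IDs are joined.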