7.00.02 - Minhash Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product
Aster Analytics
Release Number
7.00.02
Release Date
September 2017
Content Type
Programming Reference
User Guide
Publication ID
B700-1022-700K
Language
English (United States)
InputTable
Specifies the name of the input table.
OutputTable
Specifies the name of the output table.
IDColumn
Specifies the name of the input table column that contains the IDs to be clustered. Typically, these values are customer identifiers.
ItemsColumn
Specifies the name of the input table column that contains the values to use for hashing.
SeedTable
[Optional] Specifies the name of the table that contains the seeds to use for hashing. Typically, this table was created by an earlier Minhash call that specified its name in the SaveSeedTo argument.
SaveSeedTo
[Optional] Specifies the name of the table where seeds are to be saved.
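The effect of saving and reusing seeds can be illustrated outside Aster with a minimal Python sketch. It is not the Minhash implementation: the hash family (a * x + b) mod PRIME, the helper names generate_seeds and signature, and the use of a JSON string as a stand-in for the seed table are all assumptions made for illustration. The point it shows is only that hashing with reloaded seeds reproduces the earlier result.

```python
import json
import random

PRIME = 2_147_483_647  # 2^31 - 1; an assumed modulus for the illustrative hash family


def generate_seeds(hash_num):
    """Draw a fresh (a, b) pair for each hash function (hypothetical helper)."""
    rng = random.Random()  # unseeded, so each call yields different seeds
    return [[rng.randrange(1, PRIME), rng.randrange(PRIME)] for _ in range(hash_num)]


def signature(items, seeds):
    """For each hash function, keep the minimum hash value over all items."""
    return [min((a * x + b) % PRIME for x in items) for a, b in seeds]


# First call: generate seeds and save them (stand-in for SaveSeedTo).
first_run_seeds = generate_seeds(4)
saved = json.dumps(first_run_seeds)

# Later call: load the saved seeds (stand-in for SeedTable).
reloaded_seeds = json.loads(saved)

items = [10, 20, 30]
same = signature(items, first_run_seeds) == signature(items, reloaded_seeds)
print(same)
```

Without the saved seeds, a second call to generate_seeds would almost certainly produce different hash functions, so signatures from the two runs would not be comparable.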
HashNum
Specifies the number of hash functions to generate. The number_of_hash_functions determines the number and size of clusters generated.
KeyGroups
Specifies the number of key groups to generate. The number_of_keygroups must be a divisor of number_of_hash_functions. A larger number_of_keygroups decreases the probability that multiple users are assigned to the same cluster identifier.
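The relationship between HashNum and KeyGroups can be sketched in Python. This is an illustration under stated assumptions, not the Aster implementation: it assumes that KeyGroups is the number of hash values combined into each cluster key (which is consistent with the statement that larger values make shared cluster identifiers less likely), and it uses an assumed hash family (a * x + b) mod PRIME. The function names make_seeds, signature, and cluster_keys are hypothetical.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1; an assumed modulus for the illustrative hash family


def make_seeds(hash_num, rng_seed=0):
    """Create one (a, b) pair per hash function (hypothetical helper)."""
    rng = random.Random(rng_seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(hash_num)]


def signature(items, seeds):
    """For each hash function, keep the minimum hash value over all items."""
    return [min((a * x + b) % PRIME for x in items) for a, b in seeds]


def cluster_keys(items, seeds, key_groups):
    """Split the signature into consecutive groups of key_groups hash values.

    Assumption: each group acts as one cluster key. The split is only
    possible when key_groups divides the number of hash functions.
    """
    if len(seeds) % key_groups != 0:
        raise ValueError("key_groups must be a divisor of the number of hash functions")
    sig = signature(items, seeds)
    return [tuple(sig[i:i + key_groups]) for i in range(0, len(sig), key_groups)]


seeds = make_seeds(hash_num=6)
keys_a = cluster_keys({1, 2, 3, 4}, seeds, key_groups=2)
keys_b = cluster_keys({1, 2, 3, 4}, seeds, key_groups=2)
print(len(keys_a))       # 6 hash functions / 2 per key = 3 keys
print(keys_a == keys_b)  # identical item sets produce identical keys
```

Under this assumption, two IDs share a cluster key only when every hash value in the group matches, so combining more hash values per key makes a match less likely.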
InputFormat
[Optional] Specifies the format of the values to be hashed (the values in items_column). Default: 'integer'.
MinClusterSize
[Optional] Specifies the minimum cluster size. Default: 3.
MaxClusterSize
[Optional] Specifies the maximum cluster size. Default: 5.
Delimiter
[Optional] Specifies the delimiter used between hashed values (typically customer identifiers) in the output. Default: Space character.
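How MinClusterSize, MaxClusterSize, and Delimiter shape the output can be sketched in Python. This is a minimal illustration, not the Aster implementation: it assumes that clusters whose size falls outside the minimum and maximum are simply dropped, and the function name build_clusters and the sample keys and IDs are hypothetical.

```python
from collections import defaultdict


def build_clusters(id_to_keys, min_cluster_size=3, max_cluster_size=5, delimiter=" "):
    """Group IDs by shared cluster key and format each kept cluster as one string.

    Assumption: clusters smaller than min_cluster_size or larger than
    max_cluster_size are discarded.
    """
    members = defaultdict(list)
    for id_, keys in id_to_keys.items():
        for key in keys:
            members[key].append(id_)

    rows = []
    for ids in members.values():
        if min_cluster_size <= len(ids) <= max_cluster_size:
            rows.append(delimiter.join(str(i) for i in sorted(ids)))
    return sorted(rows)


# Hypothetical cluster keys per customer ID.
id_to_keys = {
    "u1": ["k1", "k2"],
    "u2": ["k1"],
    "u3": ["k1", "k2"],
    "u4": ["k3"],
}

print(build_clusters(id_to_keys))                 # ['u1 u2 u3']
print(build_clusters(id_to_keys, delimiter=","))  # ['u1,u2,u3']
```

With the defaults used here (minimum 3, maximum 5, space delimiter), only the key shared by three IDs survives. Lowering the minimum to 2 would also keep the two-member cluster.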