MinHash Input - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:id
B700-4003
Product Category
Teradata Vantage™
| Table | Description |
| --- | --- |
| InputTable | Contains items to cluster. |
| SeedTable | [Optional. Disallowed with SaveSeedTo argument.] Contains seeds to use for hashing. Typically created by earlier MinHash call that specified its name with SaveSeedTo argument. |

InputTable Schema

The table can have additional columns, but the function ignores them.

| Column | Data Type | Description |
| --- | --- | --- |
| user_id_column | Any | Identifier to cluster (for example, user identifier). |
| item_id_column | BIGINT, INTEGER, or VARCHAR | Identifiers of items on which to base clustering (for example, items that user purchased). Items are separated by delimiter. |
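
As an illustration of the InputTable layout, the following Python sketch splits each item_id_column value into a set of item identifiers per user. The space delimiter, the sample rows, and the helper name parse_items are assumptions made for this example only; they are not taken from the function reference.

```python
def parse_items(item_field, delimiter=" "):
    """Split one item_id_column value into a set of item identifiers.

    The value is converted to a string first, so a single numeric item
    (BIGINT or INTEGER) is handled the same way as a delimited VARCHAR.
    """
    return {token for token in str(item_field).split(delimiter) if token}


# Hypothetical rows in (user_id_column, item_id_column) order.
rows = [
    ("user_1", "101 102 103"),
    ("user_2", "102 103 104"),
]

# Map each identifier to cluster onto its set of items.
items_by_user = {user: parse_items(items) for user, items in rows}
```

Each resulting set is the unit that a MinHash-style algorithm would hash and compare across users.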

SeedTable Schema

| Column | Data Type | Description |
| --- | --- | --- |
| index | INTEGER | Hash identifier. Values are from 0 to number_of_hash_functions - 1. |
| a | INTEGER | Seed value that hash function used to create hash values that MinHash algorithm used. |
| b | INTEGER | Seed value that hash function used to create hash values that MinHash algorithm used. |
| p | INTEGER | Seed value that hash function used to create hash values that MinHash algorithm used. |
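
The reference does not state how a, b, and p are combined. The Python sketch below assumes the common universal hashing form h(x) = (a * x + b) mod p, with one hash function per SeedTable row, to show how such seeds could produce a MinHash signature. The formula, the seed values, the integer item identifiers, and the helper name minhash_signature are all assumptions for illustration, not the function's documented behavior.

```python
def minhash_signature(items, seeds):
    """Return one minimum hash value per seed row, ordered by index.

    Assumes each seed row defines h(x) = (a * x + b) % p. This formula
    is an assumption; the reference does not specify it.
    `items` must be a non-empty collection of integers.
    """
    signature = []
    for index, a, b, p in sorted(seeds):  # tuples sort by index first
        signature.append(min((a * x + b) % p for x in items))
    return signature


# Hypothetical seed rows in SeedTable column order: (index, a, b, p).
seeds = [(0, 3, 1, 17), (1, 5, 2, 17)]

sig_u1 = minhash_signature({1, 2, 3}, seeds)
sig_u2 = minhash_signature({2, 3, 4}, seeds)

# Fraction of positions where the two signatures agree, a rough
# estimate of how similar the two item sets are.
agreement = sum(x == y for x, y in zip(sig_u1, sig_u2)) / len(seeds)
```

Reusing the same seed rows across calls keeps signatures comparable between runs, which is consistent with SeedTable typically being created by an earlier MinHash call through SaveSeedTo.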