Ways to Create Multiple Models

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

You can create multiple models and compare their metrics to find the best model. There are two ways to create multiple models simultaneously:
  • In the NumClusters syntax element, specify multiple values.

    For example, NumClusters (3, 3, 4) fits three models: two with 3 clusters and one with 4 clusters. Repeating a value is useful because it is good practice to try multiple initializations when fitting KModes.

  • Use the RandomSample function to select multiple sets of rows from the input data table, and use these randomly selected samples as seeds in the KModes function.

    Create a table from a call to RandomSample. Give the NumSample syntax element a set of values x_1, x_2, …, x_n, where n is the number of sets of rows to create (and therefore the number of models that KModes later creates) and x_i is the number of seed rows to select for model i (and therefore the number of clusters in that model). The set_id column of the table identifies each set of points.

    Call KModes, giving the InitialSeedTable syntax element the name of the table you created and specifying ModelIDColumn ('set_id').
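
The relationship between the NumSample values and the resulting seed table can be illustrated with a small sketch. This is plain Python, not Teradata syntax, and it does not call RandomSample or KModes. The helper name sample_seed_sets, the sample rows, the zero-based set_id numbering, and sampling without replacement within each set are all assumptions made for illustration only.

```python
import random


def sample_seed_sets(rows, num_sample, seed=0):
    """Return (set_id, row) pairs, one seed set per entry in num_sample.

    Hypothetical helper: it mimics the shape of a seed table keyed by
    set_id, not the actual behavior of the RandomSample function.
    """
    rng = random.Random(seed)
    seed_table = []
    for set_id, n_seeds in enumerate(num_sample):
        # Assumption: seeds within one set are distinct rows
        # (sampling without replacement).
        for row in rng.sample(rows, n_seeds):
            seed_table.append((set_id, row))
    return seed_table


# Made-up categorical input rows.
rows = [
    ("red", "small"), ("red", "large"), ("blue", "small"),
    ("blue", "large"), ("green", "small"), ("green", "large"),
]

# Three seed sets with 3, 3, and 4 rows: three models later on.
seed_table = sample_seed_sets(rows, num_sample=[3, 3, 4])

# Count seed rows per set_id; each count is that model's cluster count.
clusters_per_model = {}
for set_id, _ in seed_table:
    clusters_per_model[set_id] = clusters_per_model.get(set_id, 0) + 1

print(clusters_per_model)  # {0: 3, 1: 3, 2: 4}
```

In this sketch, the number of entries in num_sample plays the role of n (the number of models), and each entry plays the role of x_i (the number of seed rows, and so the number of clusters, for model i).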