Ways to Create Multiple Models - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22
You can create multiple models and compare their metrics to find the best model. There are two ways to create multiple models simultaneously:
  • In the NumClusters argument, specify multiple values.

    For example, NumClusters (3, 3, 4) fits three models: two with 3 clusters and one with 4 clusters. Repeating a value is useful because it is good practice to try multiple initializations when fitting KModes.

  • Use the RandomSample function to select multiple sets of rows from the input data table, and use these randomly selected samples as seeds in the KModes function.

    Create a table from a call to RandomSample. Give the NumSample argument a set of values x_1, x_2, …, x_n, where n is the number of different sets of rows to create (which becomes the number of models that KModes later creates) and x_i is the number of seed rows to select for model i (which determines the number of clusters in model i). In the resulting table, the column set_id identifies each set of seed rows.

    Call KModes, giving the InitialSeedTable argument the name of the table you created and specifying ModelIDColumn ('set_id').
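
The relationship between the NumSample values, set_id, and the resulting models can be illustrated with a small conceptual sketch. The following Python code is not Teradata SQL and does not call RandomSample or KModes. The helper names sample_seed_sets and clusters_per_model, the sample rows, and the zero-based set_id numbering are assumptions made only for illustration. It shows that a list such as [3, 3, 4] yields three seed sets, and that the number of seed rows in each set is the number of clusters in the corresponding model.

```python
import random


def sample_seed_sets(rows, num_sample, seed=0):
    """Return (set_id, row) pairs, one seed set per entry of num_sample.

    Hypothetical stand-in for the seed table described above; zero-based
    set_id values are an assumption of this sketch.
    """
    rng = random.Random(seed)
    seeded = []
    for set_id, n_seeds in enumerate(num_sample):
        # Each set is drawn independently, without replacement within the set.
        for row in rng.sample(rows, n_seeds):
            seeded.append((set_id, row))
    return seeded


def clusters_per_model(seeded):
    """Count seed rows per set_id, i.e. the cluster count of each model."""
    counts = {}
    for set_id, _ in seeded:
        counts[set_id] = counts.get(set_id, 0) + 1
    return counts


# Made-up categorical rows standing in for the input data table.
rows = [("red", "small"), ("red", "large"), ("blue", "small"),
        ("blue", "large"), ("green", "small"), ("green", "large")]

seed_table = sample_seed_sets(rows, num_sample=[3, 3, 4])
print(clusters_per_model(seed_table))  # {0: 3, 1: 3, 2: 4}
```

In this sketch the two sets of size 3 generally contain different seed rows, which corresponds to trying more than one initialization for the same number of clusters.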