You can create multiple models and compare their metrics to find the best model. There are two ways to create multiple models simultaneously:
- In the NumClusters argument, specify multiple values.
For example, NumClusters (3, 3, 4) fits three models: two with 3 clusters and one with 4 clusters. It is good practice to try multiple initializations when fitting KModes, which is why you might specify the same number more than once.
- Use the RandomSample function to select multiple sets of rows from the input data table, and use these randomly selected samples as seeds in the KModes function.
Create a table from a call to RandomSample. Give the NumSample argument a set of values x_1, x_2, …, x_n, where n is the number of sets of rows to create (this becomes the number of models that KModes later creates) and x_i is the number of seed rows to select for set i (this determines the number of clusters in model i). The table column set_id identifies each set of rows.
Call KModes, passing the name of the table you created in the InitialSeedTable argument and specifying ModelIDColumn ('set_id').
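The idea of fitting several models from different random seed sets and then comparing them can be illustrated outside the database. The following Python sketch is not the KModes function or its algorithm as implemented in the product. It is a minimal k-modes loop written for illustration, and the names `fit_kmodes`, `fit_many`, `rows`, and the `cost` metric (total number of mismatched attributes between each row and its nearest mode) are assumptions introduced here. The list `[3, 3, 4]` plays the role of NumClusters (3, 3, 4), and each `rng.sample` call stands in for one set of seed rows, such as RandomSample might supply.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attribute positions where two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def fit_kmodes(rows, seeds, max_iter=20):
    """Illustrative k-modes: one cluster per seed row; returns (modes, cost)."""
    modes = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        # Assign every row to its nearest mode.
        labels = [
            min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
            for r in rows
        ]
        # Recompute each mode as the most frequent value per column.
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep the mode of an empty cluster
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    # Hypothetical comparison metric: total mismatch to the nearest mode.
    cost = sum(min(hamming(r, m) for m in modes) for r in rows)
    return modes, cost


def fit_many(rows, num_clusters, rng):
    """Fit one model per entry of num_clusters, each from its own seed set."""
    results = []
    for model_id, k in enumerate(num_clusters, start=1):
        seeds = rng.sample(rows, k)  # k randomly chosen seed rows
        modes, cost = fit_kmodes(rows, seeds)
        results.append({"model_id": model_id, "k": k, "cost": cost, "modes": modes})
    return results


# Made-up categorical data, for illustration only.
rows = [
    ("red", "small", "round"), ("red", "small", "oval"),
    ("red", "medium", "round"), ("blue", "large", "square"),
    ("blue", "large", "round"), ("blue", "medium", "square"),
    ("green", "small", "oval"), ("green", "large", "oval"),
    ("green", "medium", "oval"), ("red", "large", "square"),
]

results = fit_many(rows, [3, 3, 4], random.Random(0))
best = min(results, key=lambda r: r["cost"])
for r in results:
    print(r["model_id"], r["k"], r["cost"])
```

Because the two 3-cluster models start from different seed rows, they can converge to different modes and different costs, which is the reason for repeating a cluster count. Note that a cost of this kind tends to fall as the number of clusters grows, so it is most meaningful when comparing models with the same number of clusters.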