KModes Example 2: NumClusters - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22
dita:id
B700-4003

This example specifies NumClusters (3) to obtain three clusters. Because the function produces different cluster centers each time you run the example, your cluster centers and cluster assignments might differ from those shown here.

Input

The input table is kmodes_input, passed as InputTable in the SQL call below.

SQL Call

DROP TABLE kmodes_clusters1;

SELECT * FROM KModes (
  ON kmodes_input AS InputTable
  OUT TABLE OutputTable (kmodes_clusters1)
  USING
  NumClusters (3)
  TargetColumns ('mpg:carb')
  NumericAsCategorical ('cyl', 'gear', 'carb')
) AS dt;

Output

Output Table
set_id | summary | between_cluster_error | total_within_cluster_error | pseudo_f
0 | Number of Clusters: 3 | 189.040612061326 | 110.959387951497 | 16.4690218375959
  | Number of Iterations: 3 | | |
  | Model Converged: true | | |
  | Number of Data Points: 32.0 | | |
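
The reported pseudo_f value can be reproduced from the other columns of this row. The following query is a sketch and is not part of the original example. It assumes that pseudo_f is the between-cluster error divided by the number of clusters, over the total within-cluster error divided by the number of data points minus the number of clusters. That formula is inferred from the numbers above, not stated in this example, and the alias pseudo_f_check is hypothetical.

```sql
-- Assumed formula (inferred from the output, not documented in this example):
--   pseudo_f = (between_cluster_error / k) / (total_within_cluster_error / (n - k))
-- with k = 3 clusters and n = 32 data points, both taken from the summary column.
SELECT (CAST(189.040612061326 AS FLOAT) / 3)
     / (CAST(110.959387951497 AS FLOAT) / (32 - 3)) AS pseudo_f_check;
```

The result is approximately 16.469, which agrees with the pseudo_f column.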

This query returns the following table:

SELECT * FROM kmodes_clusters1 ORDER BY 1;

kmodes_clusters1
set_id | cluster_id | mpg | disp | hp | drat | wt | qsec | cyl | vs | am | gear | carb | within_cluster_ss | cluster_weight | distance_metric | category_weights
0 | 0 | -0.6870185315 | 0.45206321975 | 1.576108211 | 0.338404144 | 0.122897527 | -1.592803813 | 8 | S | manual | 5 | 4 | 12.1398871106861 | 4 | EUCLIDEAN, OVERLAP | [1.0, 1.0, 1.0, 1.0, 1.0]
0 | 1 | -0.694038285461538 | 0.884442002384615 | 0.440990547615385 | -1.03517409530769 | 0.806587495538462 | -0.0892693328461538 | 8 | S | automatic | 3 | 2 | 39.4522701602673 | 13 | |
0 | 2 | 0.7847047892 | -0.887066594 | -0.802487331133333 | 0.8069097776 | -0.731815169933333 | 0.502114438733333 | 4 | V | manual | 4 | 2 | 59.3672306805433 | 15 | |
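
The per-cluster rows in kmodes_clusters1 can be reconciled with the summary row by aggregating them. The following query is a sketch and is not part of the original example. It assumes that cluster_weight and within_cluster_ss are stored as numeric columns, and the aliases total_points and total_within_ss are hypothetical.

```sql
-- Sum the per-cluster weights and within-cluster sums of squares.
-- Assumption: both columns are numeric in kmodes_clusters1.
SELECT SUM(cluster_weight)    AS total_points,
       SUM(within_cluster_ss) AS total_within_ss
FROM kmodes_clusters1;
```

With the values shown above, total_points is 4 + 13 + 15 = 32, matching Number of Data Points, and total_within_ss is approximately 110.9594, matching total_within_cluster_error. A different run produces different clusters, so the individual values change while the two totals continue to match that run's summary row.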