KModes Example 2: NumClusters

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)

This example specifies NumClusters (3) to obtain three clusters. Because the function can produce different cluster centers each time you run the example, your cluster assignments might differ from those shown here.

Input

The input table is kmodes_input, the table named in the ON clause of the SQL call.

SQL Call

DROP TABLE kmodes_clusters1;

SELECT * FROM KModes (
  ON kmodes_input AS InputTable
  OUT TABLE OutputTable (kmodes_clusters1)
  USING
  NumClusters (3)
  TargetColumns ('mpg:carb')
  NumericAsCategorical ('cyl', 'gear', 'carb')
) AS dt;

Output

Output Table
set_id  summary                      between_cluster_error  total_within_cluster_error  pseudo_f
0       Number of Clusters: 3        189.040612061326       110.959387951497            16.4690218375959
        Number of Iterations: 3
        Model Converged: true
        Number of Data Points: 32.0

This query returns the following table:

SELECT * FROM kmodes_clusters1 ORDER BY 1;
kmodes_clusters1
set_id  cluster_id  mpg                 disp               hp                  drat               wt                  qsec                 cyl  vs  am         gear  carb  within_cluster_ss  cluster_weight  distance_metric     category_weights
0       0           -0.6870185315       0.45206321975      1.576108211         0.338404144        0.122897527         -1.592803813         8    S   manual     5     4     12.1398871106861   4               EUCLIDEAN, OVERLAP  [1.0, 1.0, 1.0, 1.0, 1.0]
0       1           -0.694038285461538  0.884442002384615  0.440990547615385   -1.03517409530769  0.806587495538462   -0.0892693328461538  8    S   automatic  3     2     39.4522701602673   13
0       2           0.7847047892        -0.887066594       -0.802487331133333  0.8069097776       -0.731815169933333  0.502114438733333    4    V   manual     4     2     59.3672306805433   15
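
As a quick consistency check on the two outputs above, the per-cluster within_cluster_ss values should add up to total_within_cluster_error, and the cluster_weight values should add up to the number of data points. The query below is a minimal sketch of that check. It assumes that kmodes_clusters1 contains only the three cluster rows shown and that within_cluster_ss and cluster_weight are numeric columns. The column aliases are illustrative and not part of the function's output.

```sql
-- Sketch only: assumes kmodes_clusters1 holds just the three cluster rows
-- shown above and that both summed columns are numeric.
-- The aliases sum_within_cluster_ss and sum_cluster_weight are illustrative.
SELECT
  SUM(within_cluster_ss) AS sum_within_cluster_ss,  -- compare with total_within_cluster_error
  SUM(cluster_weight)    AS sum_cluster_weight      -- compare with Number of Data Points
FROM kmodes_clusters1;
```

For the rows shown above, the sums work out to about 110.9594 and 32, which agree with total_within_cluster_error and Number of Data Points in the summary.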