This example specifies NumClusters (3) to obtain three clusters. The call does not supply an initial seed table, and different cluster centers are produced each time you run the example, so your cluster assignments might differ from the output shown here.
Input
- InputTable: kmodes_input, as in KModes Example 1: InitialSeedTable
SQL Call
DROP TABLE kmodes_clusters1;

SELECT * FROM KModes (
  ON kmodes_input AS InputTable
  OUT TABLE OutputTable (kmodes_clusters1)
  USING
  NumClusters (3)
  TargetColumns ('mpg:carb')
  NumericAsCategorical ('cyl', 'gear', 'carb')
) AS dt;
Output
set_id | summary | between_cluster_error | total_within_cluster_error | pseudo_f |
---|---|---|---|---|
0 | Number of Clusters: 3; Number of Iterations: 3; Model Converged: true; Number of Data Points: 32.0 | 189.040612061326 | 110.959387951497 | 16.4690218375959 |
This query returns the following table:
SELECT * FROM kmodes_clusters1 ORDER BY 1;
set_id | cluster_id | mpg | disp | hp | drat | wt | qsec | cyl | vs | am | gear | carb | within_cluster_ss | cluster_weight | distance_metric | category_weights |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | -0.6870185315 | 0.45206321975 | 1.576108211 | 0.338404144 | 0.122897527 | -1.592803813 | 8 | S | manual | 5 | 4 | 12.1398871106861 | 4 | EUCLIDEAN, OVERLAP | [1.0, 1.0, 1.0, 1.0, 1.0] |
0 | 1 | -0.694038285461538 | 0.884442002384615 | 0.440990547615385 | -1.03517409530769 | 0.806587495538462 | -0.0892693328461538 | 8 | S | automatic | 3 | 2 | 39.4522701602673 | 13 | | |
0 | 2 | 0.7847047892 | -0.887066594 | -0.802487331133333 | 0.8069097776 | -0.731815169933333 | 0.502114438733333 | 4 | V | manual | 4 | 2 | 59.3672306805433 | 15 | | |
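
The two output tables can be cross-checked against each other. The following Python sketch is not part of the KModes function. It copies the values from the output above and assumes that cluster_weight counts the rows assigned to each cluster and that total_within_cluster_error is the sum of the per-cluster within_cluster_ss values. Both assumptions are consistent with the numbers shown, but your values will differ if your run produces different clusters.

```python
import math

# Values copied from the example output above (one entry per cluster_id 0, 1, 2).
within_cluster_ss = [12.1398871106861, 39.4522701602673, 59.3672306805433]
cluster_weight = [4, 13, 15]

# Values copied from the summary row returned by the KModes call.
total_within_cluster_error = 110.959387951497
number_of_data_points = 32

# Assumption: the total within-cluster error is the sum over clusters.
wss_sum = sum(within_cluster_ss)

# Assumption: cluster_weight is the number of rows assigned to each cluster.
weight_sum = sum(cluster_weight)

print(f"sum of within_cluster_ss: {wss_sum:.6f}")
print(f"sum of cluster_weight:    {weight_sum}")

# Allow for floating-point rounding in the printed output values.
assert math.isclose(wss_sum, total_within_cluster_error, rel_tol=1e-9)
assert weight_sum == number_of_data_points
```

If either assertion fails for your own run, first confirm that the summary row and the cluster table come from the same KModes call, because each run can produce different clusters.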