Input
The InputTable, computers_train1, describes personal computers by five attributes (price, speed, hard disk size, RAM, and screen size), plus an id column. The table has over 6000 rows. These examples use different syntax elements to find eight clusters based on the five attributes.
id | price | speed | hd | ram | screen |
---|---|---|---|---|---|
1 | 1499 | 25 | 80 | 4 | 14 |
2 | 1795 | 33 | 85 | 2 | 14 |
3 | 1595 | 25 | 170 | 4 | 15 |
4 | 1849 | 25 | 170 | 8 | 14 |
5 | 3295 | 33 | 340 | 16 | 14 |
6 | 3695 | 66 | 340 | 16 | 14 |
7 | 1720 | 25 | 170 | 4 | 14 |
8 | 1995 | 50 | 85 | 2 | 14 |
9 | 2225 | 50 | 210 | 8 | 14 |
12 | 2605 | 66 | 210 | 8 | 14 |
13 | 2045 | 50 | 130 | 4 | 14 |
14 | 2295 | 25 | 245 | 8 | 14 |
16 | 2225 | 50 | 130 | 4 | 14 |
17 | 1595 | 33 | 85 | 2 | 14 |
18 | 2325 | 33 | 210 | 4 | 15 |
19 | 2095 | 33 | 250 | 4 | 15 |
20 | 4395 | 66 | 452 | 8 | 14 |
... | ... | ... | ... | ... | ... |
SQL Call
This call tries to group the five-dimensional data points into eight clusters. UnpackColumns is 'false' by default, so each centroid's five coordinates are returned packed into a single mean column.
SELECT * FROM KMeans (
  ON computers_train1 AS InputTable
  OUT TABLE OutputTable (kmeanssample_centroid)
  USING
  NumClusters (8)
  StopThreshold (0.05)
  MaxIterNum (10)
) AS dt;
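To make the roles of NumClusters, StopThreshold, and MaxIterNum more concrete, the following is a minimal Python sketch of a generic k-means loop run on the first rows of the sample input. It is not the KMeans function's implementation. The `kmeans` helper, the random seeding, and the reading of the stop threshold as "largest centroid movement in one iteration" are assumptions made for illustration.

```python
import math
import random

# First rows of the sample input as (price, speed, hd, ram, screen); id is dropped.
POINTS = [
    (1499, 25, 80, 4, 14),
    (1795, 33, 85, 2, 14),
    (1595, 25, 170, 4, 15),
    (1849, 25, 170, 8, 14),
    (3295, 33, 340, 16, 14),
    (3695, 66, 340, 16, 14),
    (1720, 25, 170, 4, 14),
    (1995, 50, 85, 2, 14),
    (2225, 50, 210, 8, 14),
    (2605, 66, 210, 8, 14),
    (2045, 50, 130, 4, 14),
    (2295, 25, 245, 8, 14),
]


def kmeans(points, num_clusters, stop_threshold, max_iter, seed=0):
    """Hypothetical helper: plain Lloyd-style k-means, not the vendor algorithm."""
    rng = random.Random(seed)
    # Assumption: seed the centroids with randomly chosen input points.
    centroids = [list(p) for p in rng.sample(points, num_clusters)]
    converged = False
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(num_clusters)]
        for p in points:
            nearest = min(range(num_clusters),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        # Assumed convergence test: largest centroid movement below the threshold.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < stop_threshold:
            converged = True
            break
    return centroids, converged, iterations


centroids, converged, iterations = kmeans(
    POINTS, num_clusters=8, stop_threshold=0.05, max_iter=10)
print("Converged:", converged, "| iterations:", iterations)
```

In this sketch, the loop ends either when the centroids stop moving by more than the threshold or when the iteration cap is reached, whichever comes first.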
Output
clusterid | mean | size | withinss |
---|---|---|---|
0 | 3072.21428571429 52.6428571428571 271.071428571429 8.57142857142857 14.7142857142857 | 14 | 586706.785714269 |
1 | 1740.88888888889 30.3333333333333 141.666666666667 4.66666666666667 14.1111111111111 | 9 | 51205.7777777761 |
2 | 1481.14285714286 27.2857142857143 112.857142857143 3.14285714285714 14.1428571428571 | 7 | 81104.8571428545 |
3 | 1991.0 38.8461538461538 160.384615384615 4.46153846153846 14.0 | 13 | 51006.0000000149 |
4 | 2615.75 41.3333333333333 220.166666666667 7.33333333333333 14.75 | 12 | 140175.50000003 |
5 | 2371.75 48.75 204.625 6.0 14.25 | 8 | 36636.375 |
6 | 2192.8 38.1 194.4 5.2 14.1 | 10 | 41549.3999999911 |
7 | 4026.42857142857 59.0 432.285714285714 9.14285714285714 14.2857142857143 | 7 | 324109.428571463 |

Converged : False
Number of Iterations : 10
Number of clusters : 8
Successfully created Output table
Total_WithinSS : 1312494.1242063986
Between_SS : 3.8234387625793636E7
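As a quick sanity check on these summary lines, the short Python sketch below adds up the per-cluster size and withinss values copied from the output and compares the total with the reported Total_WithinSS. The share of between-cluster variation uses the standard decomposition of total sum of squares into within plus between; treating the function's Between_SS that way is an assumption.

```python
import math

# (size, withinss) per cluster, copied from the output above, clusterid 0..7.
CLUSTERS = [
    (14, 586706.785714269),
    (9, 51205.7777777761),
    (7, 81104.8571428545),
    (13, 51006.0000000149),
    (12, 140175.50000003),
    (8, 36636.375),
    (10, 41549.3999999911),
    (7, 324109.428571463),
]
REPORTED_TOTAL_WITHINSS = 1312494.1242063986
REPORTED_BETWEEN_SS = 3.8234387625793636e7

total_size = sum(size for size, _ in CLUSTERS)
total_withinss = sum(withinss for _, withinss in CLUSTERS)

# The per-cluster values should add up to the reported total (up to rounding).
matches = math.isclose(total_withinss, REPORTED_TOTAL_WITHINSS, abs_tol=1e-6)

# Assumed decomposition: total SS = within SS + between SS.
between_share = REPORTED_BETWEEN_SS / (REPORTED_BETWEEN_SS + REPORTED_TOTAL_WITHINSS)

print("points counted in clusters:", total_size)
print("withinss matches reported total:", matches)
print("between-cluster share of total SS:", round(between_share, 4))
```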
SELECT * FROM kmeanssample_centroid;
clusterid | mean | size | withinss |
---|---|---|---|
1 | 1740.88888888889 30.3333333333333 141.666666666667 4.66666666666667 14.1111111111111 | 9 | 51205.7777777761 |
3 | 1991.0 38.8461538461538 160.384615384615 4.46153846153846 14.0 | 13 | 51006.0000000149 |
5 | 2371.75 48.75 204.625 6.0 14.25 | 8 | 36636.375 |
7 | 4026.42857142857 59.0 432.285714285714 9.14285714285714 14.2857142857143 | 7 | 324109.428571463 |
0 | 3072.21428571429 52.6428571428571 271.071428571429 8.57142857142857 14.7142857142857 | 14 | 586706.785714269 |
2 | 1481.14285714286 27.2857142857143 112.857142857143 3.14285714285714 14.1428571428571 | 7 | 81104.8571428545 |
4 | 2615.75 41.3333333333333 220.166666666667 7.33333333333333 14.75 | 12 | 140175.50000003 |
6 | 2192.8 38.1 194.4 5.2 14.1 | 10 | 41549.3999999911 |
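Because the centroid coordinates arrive packed in one mean value, client code usually has to split them before use. The Python sketch below shows one way to do that for a single row of the table above. It assumes the coordinates are space-separated and follow the input column order (price, speed, hd, ram, screen), which the magnitudes suggest but the output does not state; the `unpack_mean` helper is hypothetical.

```python
# Assumed coordinate order, taken from the input table's column order.
ATTRIBUTES = ("price", "speed", "hd", "ram", "screen")


def unpack_mean(mean):
    """Hypothetical helper: split a packed, space-separated mean into named floats."""
    values = [float(v) for v in mean.split()]
    if len(values) != len(ATTRIBUTES):
        raise ValueError("expected %d coordinates, got %d"
                         % (len(ATTRIBUTES), len(values)))
    return dict(zip(ATTRIBUTES, values))


# Row for clusterid 5, copied from the centroid table above.
centroid_5 = unpack_mean("2371.75 48.75 204.625 6.0 14.25")
print(centroid_5)
```

Note that the SELECT on the centroid table has no ORDER BY, so rows can come back in any clusterid order; match rows by clusterid rather than by position.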