This example uses the table of cluster centroids output by a KMeans function example.
Input
- InputTable1: computers_test1
The test data in this table consists of personal computer attributes: price, speed, hard disk size, RAM, and screen size. The table has over 1000 rows. If a row contains a null value, KMeansPredict assigns the cluster ID -1 to that row.
- InputTable2: kmeanssample_centroid, the table of cluster centroids output by KMeans Example: NumClusters, UnpackColumns ('true')

The first rows of computers_test1 (InputTable1):

id | price | speed | hd | ram | screen |
---|---|---|---|---|---|
10 | 2575 | 50 | 210 | 4 | 15 |
11 | 2195 | 33 | 170 | 8 | 15 |
15 | 2699 | 50 | 212 | 8 | 14 |
29 | 3095 | 33 | 340 | 16 | 14 |
30 | 3244 | 66 | 245 | 8 | 14 |
38 | 3795 | 66 | 500 | 8 | 14 |
45 | 3495 | 50 | 340 | 16 | 14 |
46 | 2695 | 33 | 245 | 8 | 14 |
48 | 1749 | 25 | 120 | 4 | 14 |
51 | 2499 | 33 | 170 | 4 | 14 |
52 | 2395 | 33 | 130 | 4 | 14 |
59 | 2945 | 66 | 210 | 8 | 17 |
65 | 2195 | 66 | 85 | 2 | 14 |
66 | 1495 | 25 | 170 | 4 | 14 |
70 | 3095 | 66 | 245 | 8 | 14 |
86 | 1999 | 33 | 120 | 8 | 14 |
91 | 2975 | 50 | 210 | 4 | 17 |
92 | 2145 | 66 | 130 | 4 | 14 |
93 | 2420 | 33 | 170 | 8 | 15 |
94 | 2505 | 50 | 210 | 8 | 14 |
104 | 2999 | 66 | 330 | 4 | 15 |
... | ... | ... | ... | ... | ... |
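
The null-handling rule stated for InputTable1, together with the general idea of k-means prediction (each row gets the ID of its nearest centroid), can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the function's actual implementation: the `predict_cluster` helper is hypothetical, the centroid values are invented rather than taken from kmeanssample_centroid, the cluster ID is simply the position of the centroid in the list, and plain Euclidean distance on unscaled values is assumed.

```python
import math

def predict_cluster(row, centroids):
    """Return the index of the nearest centroid, or -1 if the row has a null.

    Hypothetical helper: Euclidean distance on raw values is an assumption.
    """
    if any(value is None for value in row):
        return -1  # mirrors the documented rule: rows with nulls get cluster ID -1
    distances = [math.dist(row, centroid) for centroid in centroids]
    return distances.index(min(distances))

# Invented centroids in (price, speed, hd, ram, screen) order.
# These are NOT the contents of the centroid table used in this example.
centroids = [
    (1500.0, 33.0, 150.0, 4.0, 14.0),
    (3000.0, 66.0, 400.0, 16.0, 15.0),
]

rows = [
    (1495, 25, 170, 4, 14),    # id 66 from the input table
    (3795, 66, 500, 8, 14),    # id 38 from the input table
    (2195, None, 170, 8, 15),  # made-up row with a null speed
]

labels = [predict_cluster(row, centroids) for row in rows]
print(labels)  # [0, 1, -1]
```

With unscaled values, price dominates the distance because its range is far larger than that of the other attributes; whether the real function rescales its inputs is not stated here.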
SQL Call
SELECT * FROM KMeansPredict(
  ON computers_test1 AS InputTable1 PARTITION BY ANY
  ON kmeanssample_centroid2 AS InputTable2 DIMENSION
) AS dt;
Output
id | clusterid | price | speed | hd | ram | screen |
---|---|---|---|---|---|---|
5628 | 5 | 1749.0 | 66.0 | 545.0 | 8.0 | 15.0 |
734 | 5 | 1879.0 | 50.0 | 210.0 | 4.0 | 15.0 |
122 | 7 | 2199.0 | 33.0 | 210.0 | 4.0 | 14.0 |
3140 | 5 | 1744.0 | 50.0 | 107.0 | 2.0 | 14.0 |
591 | 7 | 2245.0 | 33.0 | 250.0 | 8.0 | 15.0 |
3405 | 0 | 1195.0 | 25.0 | 107.0 | 2.0 | 14.0 |
3609 | 4 | 1695.0 | 50.0 | 214.0 | 4.0 | 14.0 |
4343 | 4 | 1588.0 | 33.0 | 340.0 | 4.0 | 15.0 |
5485 | 2 | 3090.0 | 100.0 | 1000.0 | 24.0 | 15.0 |
5811 | 5 | 1740.0 | 50.0 | 528.0 | 8.0 | 15.0 |
... | ... | ... | ... | ... | ... | ... |