Input
The InputTable has 32 observations on 11 variables, about different models of cars.
Variable | Type | Description | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|
mpg | Normalized numerical | Miles/US gallon | -0.835197792 | 0.22254417 | -0.307088658 | S | automatic | 3 | 2 |
cyl | Numerical treated as categorical | Number of cylinders | -1.246659826 | 2.077504765 | 0.073449451 | S | automatic | 3 | 4 |
disp | Normalized numerical | Displacement (cubic inches) | 0.24956575 | 0.636460997 | -1.364760755 | S | automatic | 3 | 4 |
hp | Normalized numerical | Gross horsepower | -0.685575235 | 2.174596366 | -0.239934874 | S | automatic | 3 | 4 |
drat | Normalized numerical | Rear axle ratio | 0.473999587 | -0.917004624 | 0.426006817 | V | manual | 4 | 1 |
wt | Normalized numerical | Weight (lb/1000) | -1.564607761 | 0.309415603 | -0.54772305 | S | automatic | 3 | 2 |
qsec | Normalized numerical | 1/4 mile time | -0.722980874 | 0.360516446 | -1.124126363 | S | automatic | 3 | 4 |
vs | Categorical | Engine (V or straight) | 0.043834734 | -0.457097039 | -1.314395417 | S | manual | 5 | 6 |
am | Categorical | Transmission (automatic or manual) | 0.90416444 | -1.039646647 | 0.907275602 | V | manual | 4 | 1 |
gear | Numerical treated as categorical | Number of forward gears | 0.90416444 | -1.310481114 | 0.588295128 | V | manual | 4 | 1 |
carb | Numerical treated as categorical | Number of carburetors | 1.166003916 | -0.048290296 | -1.874010283 | S | manual | 5 | 4 |
The InitialSeedTable has three points that serve as initial cluster centers.
model | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Datsun 710 | 0.449543447 | 4 | -0.990182091 | -0.783040459 | 0.473999587 | -0.917004624 | 0.426006817 | V | manual | 4 | 1 |
Ferrari Dino | -0.064813069 | 6 | -0.691647397 | 0.412942174 | 0.043834734 | -0.457097039 | -1.314395417 | S | manual | 5 | 6 |
Lincoln Continental | -1.607882616 | 8 | 1.849931752 | 0.996348337 | -1.115740088 | 2.255335698 | -0.016088927 | S | automatic | 3 | 4 |
SQL Call
DROP TABLE kmodes_clusters; SELECT * FROM KModes ( ON kmodes_input AS InputTable ON kmodes_init AS InitialSeedTable OUT TABLE OutputTable (kmodes_clusters) USING TargetColumns ('mpg:carb') NumericAsCategorical ('cyl','gear','carb') ) AS dt;
Output
With InitialSeedTable, the cluster centers and assignments are the same every time, with the same distance metric (in this case, the default, Euclidean).
+This query returns the following table:
SELECT * FROM kmodes_clusters;
cluster_id mpg disp hp drat wt qsec cyl vs am gear carb within_cluster_ss cluster_weight distance_metric category_weights ---------- ------------------ ------------------ ------------------ ----------------- ------------------ ----------------- --- -- --------- ---- ---- ----------------- -------------- ----------------- ------------------------- 1 -0.2639188168 -0.059076587 0.760068841 0.4478156392 -0.221011145 -1.2494800924 6 s manual 5 4 20.7870494221671 5.0 NULL NULL 0 -0.724943435928571 0.890010157642857 0.511912862714286 -0.9434069635 0.794435602785714 -0.180375863 8 s automatic 3 4 43.6174097337196 14.0 EUCLIDEAN,OVERLAP [1.0, 1.0, 1.0, 1.0, 1.0] 2 0.882215552923077 -0.935750713230769 -0.843624945153846 0.843739945692308 -0.770541747153846 0.674820195846154 4 v manual 4 2 48.7743732753756 13.0 NULL NULL
Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.