This example uses KMeans|| sampling. Like RandomSample Example: SamplingMode ('kmeans++'), it treats the numeric variables cyl, gear, and carb as categorical variables and uses the categorical variables vs and am. However, this example uses the Manhattan distance metric for the numeric variables and the Hamming distance metric for the categorical variables. Because the Hamming distance metric compares strings position by position, the category values within a column must have equal length. Therefore, assume that in column am of the input table, 'manual' has been changed to 'manualsys' (which is the same length as 'automatic').
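To illustrate why the category values must have equal length, the following Python sketch implements the textbook definitions of the two metrics. It is not the RandomSample implementation: the function names hamming and manhattan are hypothetical, and how the function combines these distances with categoryWeights is not shown here. The numeric vectors are the mpg, disp, and hp values of the first two rows of the output of this example.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # The textbook Hamming distance is undefined for unequal lengths.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def manhattan(p, q) -> float:
    """Sum the absolute coordinate differences of two numeric vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of components")
    return sum(abs(x - y) for x, y in zip(p, q))


# Category values of column am after the UPDATE: both are 9 characters long.
d_am = hamming("automatic", "manualsys")

# mpg, disp, hp of two output rows (illustrative subset of the numeric columns).
d_num = manhattan([10.4, 466.0, 210.0], [13.8, 355.0, 245.0])

print(d_am, d_num)
```

With the original value 'manual' (6 characters), the call hamming("automatic", "manual") raises ValueError in this sketch, which is the situation that the rename to 'manualsys' avoids.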
Input
- InputTable: fs_input1, created from fs_input (in RandomSample Example: SamplingMode ('basic'), Weighted) and populated with these statements:
CREATE TABLE fs_input1 AS ( SELECT * FROM fs_input ) WITH DATA;
UPDATE fs_input1 SET am='manualsys' WHERE am='manual';
SQL Call
SELECT * FROM RandomSample(
  ON fs_input1 AS InputTable
  USING
  NumSample(20)
  SamplingMode('KMeans||')
  TargetColumns('mpg:carb')
  categoryWeights(1000,10,100,100,100)
  NumericAsCategorical('cyl','gear','carb')
  CategoricalDistance('HAMMING')
  distance('MANHATTAN')
  Seed(1)
  IterNum(2)
  seedcolumn('model')
) AS dt ORDER BY 1,2,3;
Output
set_id mpg   cyl disp             hp               drat             wt     qsec             vs am        gear carb
------ ----- --- ---------------- ---------------- ---------------- ------ ---------------- -- --------- ---- ----
0      10.4  8   466.0            210.0            2.965            5.337  17.9             s  automatic 3    4
0      13.8  8   355.0            245.0            3.47             3.705  15.625           s  automatic 3    4
0      14.7  8   440.0            230.0            3.23             5.345  17.42            s  automatic 3    4
0      15.0  8   301.0            335.0            3.54             3.57   14.6             s  manualsys 5    8
0      15.35 8   311.0            150.0            2.955            3.4775 17.085           s  automatic 3    2
0      15.8  8   351.0            264.0            4.22             3.17   14.5             s  manualsys 5    4
0      16.4  8   275.8            180.0            3.07             4.07   17.4             s  automatic 3    3
0      18.5  6   167.6            123.0            3.92             3.44   18.6             v  automatic 4    4
0      18.7  8   360.0            175.0            3.15             3.44   17.02            s  automatic 3    2
0      19.7  6   145.0            175.0            3.62             2.77   15.5             s  manualsys 5    6
0      21.0  6   160.0            110.0            3.9              2.62   16.46            s  manualsys 4    4
0      21.4  4   121.0            109.0            4.11             2.78   18.6             v  manualsys 4    2
0      21.4  6   258.0            110.0            3.08             3.215  19.44            v  automatic 3    1
0      21.5  4   120.1            97.0             3.7              2.465  20.01            v  automatic 3    1
0      22.8  4   140.8            95.0             3.92             3.15   22.9             v  automatic 4    2
0      22.8  4   108.0            93.0             3.85             2.32   18.61            v  manualsys 4    1
0      24.4  4   146.7            62.0             3.69             3.19   20.0             v  automatic 4    2
0      26.0  4   120.3            91.0             4.43             2.14   16.7             s  manualsys 5    2
0      30.4  4   75.7             52.0             4.93             1.615  18.52            v  manualsys 4    2
0      31.2  4   76.2666666666667 65.6666666666667 4.12666666666667 1.99   19.4233333333333 v  manualsys 4    1