RandomSample Example 3: KMeans|| Sampling

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)

This example uses KMeans|| sampling. Like Example 2, it treats the numeric variables cyl, gear, and carb as categorical variables and also uses the categorical variables vs and am. However, this example uses the Manhattan distance metric for the numeric variables and the Hamming distance metric for the categorical variables. Because the Hamming distance metric requires categories of equal length, the value 'manual' in column am of the input table is changed to 'manualsys', which is the same length as 'automatic'.
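The equal-length requirement can be checked directly on the category strings. The following query is a minimal sketch, not part of the original example; it uses the standard CHARACTER_LENGTH function on literals, and the column aliases are illustrative names only.

```sql
-- Compare the lengths of the category values discussed above.
-- 'automatic' and 'manualsys' each have 9 characters; 'manual' has 6.
SELECT
  CHARACTER_LENGTH('automatic') AS automatic_len,
  CHARACTER_LENGTH('manual')    AS manual_len,
  CHARACTER_LENGTH('manualsys') AS manualsys_len;
```

Because 'manual' is shorter than 'automatic', the two original values would not satisfy the equal-length requirement, which is why the example renames 'manual'.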

Input

  • InputTable: fs_input1, created from fs_input (in RandomSample Example 1: Basic Sampling (Weighted)) and populated with these statements:
    CREATE MULTISET TABLE fs_input1 AS (
      SELECT * FROM fs_input
    ) WITH DATA;
    
    UPDATE fs_input1 SET am='manualsys' WHERE am='manual';
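Before calling RandomSample, it may be worth confirming that the UPDATE took effect. The query below is a hedged sketch that assumes fs_input1 was created and updated with the statements above; the alias row_count is an illustrative name.

```sql
-- List the distinct values of am and how many rows carry each one.
-- After the UPDATE, 'manual' should no longer appear in the result.
SELECT am, COUNT(*) AS row_count
FROM fs_input1
GROUP BY am
ORDER BY am;
```

If 'manual' still appears in the result, the Hamming distance requirement of equal-length categories is not met for column am.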

SQL Call

SELECT * FROM RandomSample (
  ON fs_input1 AS InputTable
  USING
  NumSample (20)
  SamplingMode ('kmeans||')
  InputColumns ('mpg:carb')
  CategoryWeights (1000, 10, 100, 100, 100)
  AsCategories ('cyl', 'gear', 'carb')
  CategoricalDistance ('hamming')
  Distance ('manhattan')
  Seed (1)
  IterationNum (2)
  SeedColumn ('model')
) AS dt ORDER BY 1,2,3;

Output

set_id  mpg     cyl  disp    hp     drat    wt       qsec     vs  am         gear  carb
------  ------  ---  ------  -----  ------  -------  -------  --  ---------  ----  ----
0       12.42   8    414.4   228    3.324   4.7398   16.808   S   automatic  3     4
0       15.8    8    351     264    4.22    3.17     14.5     S   manualsys  5     4
0       17.225  8    349     162.5  2.9375  3.58125  16.9525  S   automatic  3     2
0       17.3    8    275.8   180    3.07    3.73     17.6     S   automatic  3     3
0       19.2    6    67.6    123    3.92    3.44     18.3     V   automatic  4     4
0       19.7    6    145     175    3.62    2.77     15.5     S   manualsys  5     6
0       21.4    4    121     109    4.11    2.78     18.6     V   manualsys  4     2
0       21.4    6    258     110    3.08    3.215    19.44    V   automatic  3     1
0       21.5    4    120.1   97     3.7     2.465    20.01    V   automatic  3     1
0       23.6    4    143.75  78.5   3.805   3.17     21.45    V   automatic  4     2