RandomSample Example: SamplingMode ('kmeans||')

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

This example uses KMeans|| sampling. As in RandomSample Example: SamplingMode ('kmeans++'), the numeric variables cyl, gear, and carb are treated as categorical variables, alongside the categorical variables vs and am. However, this example uses the Manhattan distance metric for the numeric variables and the Hamming distance metric for the categorical variables. Because the Hamming distance metric requires category values of equal length, assume that in column am of the input table, 'manual' has been changed to 'manualsys' (which, like 'automatic', is nine characters long).
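The two metrics named above can be illustrated with a minimal Python sketch. It is not the RandomSample implementation: the helper names `hamming` and `manhattan` are hypothetical, the distances are raw and unweighted (the effect of the category weights and any internal scaling is not modeled), and the numeric values are the mpg, disp, hp, drat, wt, and qsec columns of the first two rows of the Output table in this example.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # This is why 'manual' cannot be compared with 'automatic' directly.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def manhattan(p, q) -> float:
    """Sum of absolute coordinate differences between two numeric vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of coordinates")
    return sum(abs(x - y) for x, y in zip(p, q))


# Numeric columns (mpg, disp, hp, drat, wt, qsec) of the first two output rows.
row_a = [10.4, 466.0, 210.0, 2.965, 5.337, 17.9]
row_b = [13.8, 355.0, 245.0, 3.47, 3.705, 15.625]

# Assumption: plain textbook definitions, applied without any weighting.
numeric_distance = manhattan(row_a, row_b)
category_distance = hamming("automatic", "manualsys")
```

In this sketch, comparing 'automatic' with the original value 'manual' raises an error because the lengths differ (9 versus 6), which is the situation the renaming to 'manualsys' avoids.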

Input

SQL Call

SELECT * FROM RandomSample(
  ON fs_input1 AS InputTable
  USING
  NumSample(20)
  SamplingMode('KMeans||')
  TargetColumns('mpg:carb')
  CategoryWeights(1000,10,100,100,100)
  NumericAsCategorical('cyl','gear','carb')
  CategoricalDistance('HAMMING')
  Distance('MANHATTAN')
  Seed(1)
  IterNum(2)
  SeedColumn('model')
) AS dt ORDER BY 1,2,3;

Output

 set_id mpg   cyl disp             hp               drat             wt     qsec             vs am        gear carb 
 ------ ----- --- ---------------- ---------------- ---------------- ------ ---------------- -- --------- ---- ---- 
      0  10.4   8            466.0            210.0            2.965  5.337             17.9 s  automatic    3    4
      0  13.8   8            355.0            245.0             3.47  3.705           15.625 s  automatic    3    4
      0  14.7   8            440.0            230.0             3.23  5.345            17.42 s  automatic    3    4
      0  15.0   8            301.0            335.0             3.54   3.57             14.6 s  manualsys    5    8
      0 15.35   8            311.0            150.0            2.955 3.4775           17.085 s  automatic    3    2
      0  15.8   8            351.0            264.0             4.22   3.17             14.5 s  manualsys    5    4
      0  16.4   8            275.8            180.0             3.07   4.07             17.4 s  automatic    3    3
      0  18.5   6            167.6            123.0             3.92   3.44             18.6 v  automatic    4    4
      0  18.7   8            360.0            175.0             3.15   3.44            17.02 s  automatic    3    2
      0  19.7   6            145.0            175.0             3.62   2.77             15.5 s  manualsys    5    6
      0  21.0   6            160.0            110.0              3.9   2.62            16.46 s  manualsys    4    4
      0  21.4   4            121.0            109.0             4.11   2.78             18.6 v  manualsys    4    2
      0  21.4   6            258.0            110.0             3.08  3.215            19.44 v  automatic    3    1
      0  21.5   4            120.1             97.0              3.7  2.465            20.01 v  automatic    3    1
      0  22.8   4            140.8             95.0             3.92   3.15             22.9 v  automatic    4    2
      0  22.8   4            108.0             93.0             3.85   2.32            18.61 v  manualsys    4    1
      0  24.4   4            146.7             62.0             3.69   3.19             20.0 v  automatic    4    2
      0  26.0   4            120.3             91.0             4.43   2.14             16.7 s  manualsys    5    2
      0  30.4   4             75.7             52.0             4.93  1.615            18.52 v  manualsys    4    2
      0  31.2   4 76.2666666666667 65.6666666666667 4.12666666666667   1.99 19.4233333333333 v  manualsys    4    1
