RandomSample Example: SamplingMode ('kmeans||') - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

This example uses KMeans|| sampling. Like RandomSample Example: SamplingMode ('kmeans++'), it treats the numeric variables cyl, gear, and carb as categorical variables and also uses the categorical variables vs and am. However, this example uses the Manhattan distance metric for the numeric variables and the Hamming distance metric for the categorical variables. Because the Hamming distance metric requires categories of equal length, assume that in column am of the input table, 'manual' has been changed to 'manualsys' (which has the same length as 'automatic').
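The two distance metrics named above can be sketched in a few lines of Python. This is an illustration only, not the RandomSample implementation: the function names hamming_distance and manhattan_distance are hypothetical, and the sketch assumes the Hamming distance is computed character by character on the category strings, which is why the strings must have equal length.

```python
from typing import Sequence


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ.

    Assumption: categories are compared character by character,
    so both strings must have the same length.
    """
    if len(a) != len(b):
        raise ValueError(
            f"Hamming distance needs equal lengths: {a!r} ({len(a)}) vs {b!r} ({len(b)})"
        )
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def manhattan_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum the absolute coordinate-wise differences of two numeric vectors."""
    if len(x) != len(y):
        raise ValueError("Manhattan distance needs vectors of equal length")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


if __name__ == "__main__":
    # 'manualsys' and 'automatic' both have 9 characters, so this works.
    print(hamming_distance("automatic", "manualsys"))  # prints 9

    # The original value 'manual' has 6 characters, so the comparison fails.
    try:
        hamming_distance("automatic", "manual")
    except ValueError as err:
        print(err)

    # Manhattan distance over two numeric columns (mpg, disp) of two output rows.
    print(manhattan_distance([21.0, 160.0], [22.8, 108.0]))
```

The sketch does not show how the function combines the two metrics or applies CategoryWeights, because this example does not describe that.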

Input

SQL Call

SELECT * FROM RandomSample(
  ON fs_input1 AS InputTable
  USING
  NumSample(20)
  SamplingMode('KMeans||')
  TargetColumns('mpg:carb')
  CategoryWeights(1000,10,100,100,100)
  NumericAsCategorical('cyl','gear','carb')
  CategoricalDistance('HAMMING')
  Distance('MANHATTAN')
  Seed(1)
  IterNum(2)
  SeedColumn('model')
) AS dt ORDER BY 1,2,3;

Output

 set_id mpg   cyl disp             hp               drat             wt     qsec             vs am        gear carb 
 ------ ----- --- ---------------- ---------------- ---------------- ------ ---------------- -- --------- ---- ---- 
      0  10.4   8            466.0            210.0            2.965  5.337             17.9 s  automatic    3    4
      0  13.8   8            355.0            245.0             3.47  3.705           15.625 s  automatic    3    4
      0  14.7   8            440.0            230.0             3.23  5.345            17.42 s  automatic    3    4
      0  15.0   8            301.0            335.0             3.54   3.57             14.6 s  manualsys    5    8
      0 15.35   8            311.0            150.0            2.955 3.4775           17.085 s  automatic    3    2
      0  15.8   8            351.0            264.0             4.22   3.17             14.5 s  manualsys    5    4
      0  16.4   8            275.8            180.0             3.07   4.07             17.4 s  automatic    3    3
      0  18.5   6            167.6            123.0             3.92   3.44             18.6 v  automatic    4    4
      0  18.7   8            360.0            175.0             3.15   3.44            17.02 s  automatic    3    2
      0  19.7   6            145.0            175.0             3.62   2.77             15.5 s  manualsys    5    6
      0  21.0   6            160.0            110.0              3.9   2.62            16.46 s  manualsys    4    4
      0  21.4   4            121.0            109.0             4.11   2.78             18.6 v  manualsys    4    2
      0  21.4   6            258.0            110.0             3.08  3.215            19.44 v  automatic    3    1
      0  21.5   4            120.1             97.0              3.7  2.465            20.01 v  automatic    3    1
      0  22.8   4            140.8             95.0             3.92   3.15             22.9 v  automatic    4    2
      0  22.8   4            108.0             93.0             3.85   2.32            18.61 v  manualsys    4    1
      0  24.4   4            146.7             62.0             3.69   3.19             20.0 v  automatic    4    2
      0  26.0   4            120.3             91.0             4.43   2.14             16.7 s  manualsys    5    2
      0  30.4   4             75.7             52.0             4.93  1.615            18.52 v  manualsys    4    2
      0  31.2   4 76.2666666666667 65.6666666666667 4.12666666666667   1.99 19.4233333333333 v  manualsys    4    1
