RandomSample Example: SamplingMode ('kmeans++')

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

This example uses KMeans++ sampling with the Manhattan distance metric and treats the numeric variables cyl, gear, and carb as categorical variables, in addition to vs and am, which are already categorical. The category weights are assigned in the order in which the categorical columns appear in the input table: 1000 to cyl, 10 to vs, 100 to am, 100 to gear, and 100 to carb.
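The pairing of weights with columns can be checked with a short Python sketch. It is only an illustration: the helper names `pair_weights` and `manhattan_mixed` are hypothetical, and the distance rule (absolute difference for numeric columns, the column's weight for a categorical mismatch, zero for a match, no scaling of numeric values) is an assumption, not a statement of how RandomSample computes distances internally. The two rows are copied from the output table of this example.

```python
# Target columns in input-table order (mpg through carb), as in the example output.
TARGET_COLUMNS = ["mpg", "cyl", "disp", "hp", "drat", "wt", "qsec",
                  "vs", "am", "gear", "carb"]
NON_NUMERIC = {"vs", "am"}                        # categorical by data type
NUMERIC_AS_CATEGORICAL = {"cyl", "gear", "carb"}  # from NumericAsCategorical
CATEGORY_WEIGHTS = [1000, 10, 100, 100, 100]      # from CategoryWeights


def pair_weights(columns, categorical, weights):
    """Pair each weight with a categorical column, in table order (hypothetical helper)."""
    cat_cols = [c for c in columns if c in categorical]
    if len(cat_cols) != len(weights):
        raise ValueError("need exactly one weight per categorical column")
    return dict(zip(cat_cols, weights))


def manhattan_mixed(row_a, row_b, columns, weights):
    """ASSUMED rule: |a - b| for numeric columns; weight on a categorical mismatch."""
    total = 0.0
    for col in columns:
        if col in weights:
            if row_a[col] != row_b[col]:
                total += weights[col]
        else:
            total += abs(row_a[col] - row_b[col])
    return total


weights = pair_weights(TARGET_COLUMNS,
                       NON_NUMERIC | NUMERIC_AS_CATEGORICAL,
                       CATEGORY_WEIGHTS)
print(weights)
# {'cyl': 1000, 'vs': 10, 'am': 100, 'gear': 100, 'carb': 100}

mazda = {"mpg": 21.0, "cyl": 6, "disp": 160.0, "hp": 110.0, "drat": 3.9,
         "wt": 2.62, "qsec": 16.46, "vs": "s", "am": "manual",
         "gear": 4, "carb": 4}
datsun = {"mpg": 22.8, "cyl": 4, "disp": 108.0, "hp": 93.0, "drat": 3.85,
          "wt": 2.32, "qsec": 18.61, "vs": "v", "am": "manual",
          "gear": 4, "carb": 1}

# Mismatches on cyl, vs, and carb contribute 1000 + 10 + 100 under the assumed rule.
print(round(manhattan_mixed(mazda, datsun, TARGET_COLUMNS, weights), 2))
```

Under this assumed rule, the weight of 1000 on cyl makes a difference in cylinder count outweigh every other difference between two rows, which is one plausible reading of why a large weight might be chosen for a column.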

This example also specifies SetIDAsFirstColumn ('false'), causing the function-generated set_id column to appear last in the output table.

SQL Call

SELECT * FROM RandomSample(
  ON fs_input AS InputTable
  USING
  NumSample(10)
  SamplingMode('KMeans++')
  TargetColumns('mpg:carb')
  CategoryWeights(1000,10,100,100,100)
  NumericAsCategorical('cyl','gear','carb')
  Distance('manhattan')
  Seed(1)
  SeedColumn('model')
  SetIdAsFirstColumn('false')
) AS dt ORDER BY 1,2,3;
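The call above requests ten rows with SamplingMode('KMeans++'). As a rough illustration of what k-means++ style selection means, the following Python sketch implements the textbook seeding procedure: the first row is drawn uniformly at random, and each later row is drawn with probability proportional to its squared distance from the nearest row already chosen. This is a sketch under stated assumptions, not RandomSample's implementation: the function name `kmeanspp_sample` is hypothetical, Python's `random.Random` merely stands in for Seed(1), and the role of SeedColumn is not modeled.

```python
import random


def kmeanspp_sample(rows, k, distance, seed=1):
    """Pick k row indices by textbook k-means++ seeding (hypothetical helper)."""
    if not 0 < k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")
    rng = random.Random(seed)

    # First pick: uniform over all rows.
    chosen = [rng.randrange(len(rows))]
    # Squared distance from every row to its nearest chosen row.
    d2 = [distance(r, rows[chosen[0]]) ** 2 for r in rows]

    while len(chosen) < k:
        if sum(d2) == 0:
            # Every remaining row coincides with a chosen row: fall back to uniform.
            picked = set(chosen)
            nxt = rng.choice([i for i in range(len(rows)) if i not in picked])
        else:
            # Rows far from the current sample are more likely to be drawn;
            # already chosen rows have weight 0 and cannot be drawn again.
            nxt = rng.choices(range(len(rows)), weights=d2, k=1)[0]
        chosen.append(nxt)
        d2 = [min(old, distance(r, rows[nxt]) ** 2) for old, r in zip(d2, rows)]

    return sorted(chosen)


def manhattan(a, b):
    """Plain Manhattan distance between two numeric tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


points = [(0.0,), (0.1,), (10.0,), (10.1,), (20.0,)]
sample = kmeanspp_sample(points, 3, manhattan, seed=1)
print(sample)
```

Because far-away rows are favored at each step, a sample drawn this way tends to spread across the data rather than cluster in its densest region.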

Output

 sn model               mpg  cyl disp  hp    drat wt    qsec  vs am        gear carb set_id 
 -- ------------------- ---- --- ----- ----- ---- ----- ----- -- --------- ---- ---- ------ 
  1 mazda rx4           21.0   6 160.0 110.0  3.9  2.62 16.46 s  manual       4    4      0
  3 datsun 710          22.8   4 108.0  93.0 3.85  2.32 18.61 v  manual       4    1      0
  5 hornet sportabout   18.7   8 360.0 175.0 3.15  3.44 17.02 s  automatic    3    2      0
 14 merc 450slc         15.2   8 275.8 180.0 3.07  3.78  18.0 s  automatic    3    3      0
 16 lincoln continental 10.4   8 460.0 215.0  3.0 5.424 17.82 s  automatic    3    4      0
 20 toyota corolla      33.9   4  71.1  65.0 4.22 1.835  19.9 v  manual       4    1      0
 22 dodge challenger    15.5   8 318.0 150.0 2.76  3.52 16.87 s  automatic    3    2      0
 28 lotus europa        30.4   4  95.1 113.0 3.77 1.513  16.9 v  manual       5    2      0
 31 maserati bora       15.0   8 301.0 335.0 3.54  3.57  14.6 s  manual       5    8      0
 32 volvo 142e          21.4   4 121.0 109.0 4.11  2.78  18.6 v  manual       4    2      0

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.