Sampling Example: Conditional, Variable Sample Rates - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
Product Category
Teradata Vantage™

This example applies sampling rates of 20%, 30%, and 40% to the categories fair, very good, and excellent, respectively. The number of rows sampled from each category is rounded to the nearest integer.

SQL Call

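The syntax elements StratumColumn and Strata always appear together. In this call, each SampleFraction value applies to the stratum in the same position in Strata: 0.2 to fair, 0.3 to very good, and 0.4 to excellent.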

SELECT * FROM Sampling(
  ON score_category PARTITION BY ANY
  USING
  SampleFraction(0.2, 0.3, 0.4)
  StratumColumn('stratum')
  Strata('fair', 'very good', 'excellent')
  Seed(2)
) AS dt ORDER BY stratum, id, score;

Output

 id  score stratum   
 --- ----- --------- 
   5  90.0 excellent
  35  94.0 excellent
  60  97.0 excellent
  70  95.0 excellent
  76 100.0 excellent
  78  91.0 excellent
  90 100.0 excellent
  10  27.0 fair     
  14  52.0 fair     
  23  14.0 fair     
  27  44.0 fair     
  32   3.0 fair     
  45  21.0 fair     
  65  53.0 fair     
  71  59.0 fair     
  72  79.0 fair     
 100  18.0 fair     
   2  83.0 very good
  20  85.0 very good
  53  87.0 very good
  83  81.0 very good
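
The per-stratum arithmetic described above (one sample rate per stratum, with the sample size rounded to the nearest integer) can be sketched outside the database. The following Python snippet is only an illustration under stated assumptions, not the Sampling function's implementation: the helper name stratified_sample, the toy rows (10 per stratum, not the score_category table), and the round-half-up tie rule are all hypothetical, and Python's random number generator will not reproduce the rows that Seed(2) selects in Vantage.

```python
import math
import random
from collections import defaultdict


def stratified_sample(rows, stratum_key, fractions, seed=None):
    """Sample each stratum at its own rate.

    rows        -- list of dicts, one per input row
    stratum_key -- name of the column that holds the stratum value
    fractions   -- dict mapping stratum value -> sample fraction in [0, 1]

    Rows whose stratum has no entry in `fractions` are not sampled
    (an assumption of this sketch).
    """
    rng = random.Random(seed)

    # Group the rows by stratum value.
    groups = defaultdict(list)
    for row in rows:
        groups[row[stratum_key]].append(row)

    sample = []
    for stratum, fraction in fractions.items():
        members = groups.get(stratum, [])
        # Round the target size to the nearest integer
        # (round-half-up is an assumed tie rule).
        k = math.floor(fraction * len(members) + 0.5)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical toy data: 10 rows per stratum.
labels = ["fair"] * 10 + ["very good"] * 10 + ["excellent"] * 10
rows = [{"id": i, "stratum": s} for i, s in enumerate(labels, start=1)]

rates = {"fair": 0.2, "very good": 0.3, "excellent": 0.4}
picked = stratified_sample(rows, "stratum", rates, seed=2)

counts = {s: sum(r["stratum"] == s for r in picked) for s in rates}
print(counts)  # {'fair': 2, 'very good': 3, 'excellent': 4}
```

With 10 rows in every stratum, the rates 0.2, 0.3, and 0.4 yield 2, 3, and 4 sampled rows. The counts in the output table above differ because they depend on how many rows of each category the actual input table contains.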
