Sampling Example 2: Conditional Sampling, Variable Sample Rates

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

This example applies sampling rates of 20%, 30%, and 40% to the categories fair, very good, and excellent, respectively, and rounds the number of rows sampled from each category to the nearest integer.
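The per-category arithmetic described above can be sketched outside the database. The following Python is a minimal illustration, not the Sampling function's implementation: the helper name sample_by_stratum, the row layout, and the round-half-up tie rule are all assumptions made for the sketch, and the demo rows are made up rather than taken from score_category.

```python
import random
from collections import defaultdict


def sample_by_stratum(rows, rates, seed=None):
    """Draw round(rate * stratum_size) rows from each stratum named in `rates`.

    `rows` is a list of dicts with a 'stratum' key. Strata that are not
    listed in `rates` are not sampled. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)

    # Group the rows by their stratum value.
    groups = defaultdict(list)
    for row in rows:
        groups[row["stratum"]].append(row)

    sample = []
    for stratum, rate in rates.items():
        members = groups.get(stratum, [])
        # Round to the nearest integer, halves rounded up (an assumption;
        # Python's built-in round() rounds halves to the nearest even number).
        k = int(len(members) * rate + 0.5)
        # Sample k distinct rows without replacement.
        sample.extend(rng.sample(members, k))
    return sample


# Made-up demo data: 10 rows per listed stratum plus an unlisted one.
labels = ["fair"] * 10 + ["very good"] * 10 + ["excellent"] * 10 + ["poor"] * 5
demo_rows = [{"id": i, "stratum": s} for i, s in enumerate(labels, start=1)]
demo_rates = {"fair": 0.2, "very good": 0.3, "excellent": 0.4}

picked = sample_by_stratum(demo_rows, demo_rates, seed=2)
print(len(picked))  # 2 + 3 + 4 rows
```

With 10 rows in each listed category, the sketch draws 2, 3, and 4 rows and skips the unlisted category entirely. The seed here only makes the Python run repeatable; it does not reproduce the rows that Seed (2) selects in the SQL call.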

SQL Call

The arguments StratumColumn and Strata always appear together. The SampleFraction values correspond, in order, to the Strata values.

SELECT * FROM Sampling (
  ON score_category PARTITION BY ANY
  USING
  SampleFraction (0.2, 0.3, 0.4)
  StratumColumn ('stratum')
  Strata ('fair', 'very good', 'excellent')
  Seed (2)
) AS dt ORDER BY stratum, id, score;

Output

id   score  stratum
12   93     excellent
28   90     excellent
60   97     excellent
78   91     excellent
90   100    excellent
8    57     fair
10   27     fair
21   5      fair
24   11     fair
27   44     fair
32   3      fair
37   14     fair
42   39     fair
46   19     fair
49   8      fair
54   43     fair
61   6      fair
79   71     fair
81   13     fair
85   79     fair
94   76     fair
99   44     fair
100  18     fair
20   85     very good
95   84     very good
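To check the sample sizes per category, the output rows can be tallied by stratum. The Python below is a small sketch that copies the rows from the output above, with the third category written as in the Strata argument ('very good'); the helper name tally_strata is hypothetical.

```python
from collections import Counter

# Rows copied from the Output table: id, score, stratum.
OUTPUT = """\
12 93 excellent
28 90 excellent
60 97 excellent
78 91 excellent
90 100 excellent
8 57 fair
10 27 fair
21 5 fair
24 11 fair
27 44 fair
32 3 fair
37 14 fair
42 39 fair
46 19 fair
49 8 fair
54 43 fair
61 6 fair
79 71 fair
81 13 fair
85 79 fair
94 76 fair
99 44 fair
100 18 fair
20 85 very good
95 84 very good
"""


def tally_strata(text):
    """Count rows per stratum; the stratum label may contain spaces."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split off id and score only, so 'very good' stays in one piece.
        _id, _score, stratum = line.split(maxsplit=2)
        counts[stratum.strip()] += 1
    return counts


counts = tally_strata(OUTPUT)
print(dict(counts))
print(sum(counts.values()))
```

The tally comes to 5 excellent, 18 fair, and 2 very good rows, 25 in total.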