This example selects a sample of approximately 20% of the rows in the input table.
Input
The input table, score_category, categorizes the students (in the table students) by score in a given subject.
id | score |
---|---|
1 | 5 |
2 | 83 |
3 | 95 |
4 | 95 |
5 | 90 |
6 | 55 |
7 | 40 |
8 | 57 |
9 | 65 |
10 | 27 |
... | ... |
This statement creates score_category from students:
CREATE MULTISET TABLE score_category AS ( SELECT *, CASE WHEN score <= 80 THEN 'fair' WHEN score > 80 AND score < 90 THEN 'very good' WHEN score >= 90 THEN 'excellent' END AS stratum FROM students ) WITH DATA;
id | score | stratum |
---|---|---|
1 | 5 | fair |
2 | 83 | very good |
3 | 95 | excellent |
4 | 95 | excellent |
5 | 90 | excellent |
6 | 55 | fair |
7 | 40 | fair |
8 | 57 | fair |
9 | 65 | fair |
10 | 27 | fair |
SQL Call
SELECT * FROM Sampling (
ON students PARTITION BY ANY
USING
SampleFraction (0.2)
Seed (2)
) AS dt ORDER BY id;
Output
id | score |
---|---|
4 | 95 |
6 | 55 |
15 | 22 |
16 | 19 |
21 | 5 |
22 | 25 |
27 | 44 |
32 | 3 |
36 | 15 |
39 | 50 |
44 | 70 |
47 | 79 |
65 | 53 |
67 | 29 |
68 | 30 |
69 | 18 |
74 | 7 |
79 | 71 |
81 | 13 |
83 | 81 |
85 | 79 |
92 | 32 |