This example selects a sample of approximately 20% of the rows in the input table.
Input
The input table, score_category, categorizes the students (in the table students) by score in a given subject.
id | score |
---|---|
1 | 5 |
2 | 83 |
3 | 95 |
4 | 95 |
5 | 90 |
6 | 55 |
7 | 40 |
8 | 57 |
9 | 65 |
10 | 27 |
... | ... |
This statement creates score_category from students:
CREATE MULTISET TABLE score_category AS ( SELECT *, CASE WHEN score <= 80 THEN 'fair' WHEN score > 80 AND score < 90 THEN 'very good' WHEN score >= 90 THEN 'excellent' END AS stratum FROM students ) WITH DATA;
id | score | stratum |
---|---|---|
1 | 5 | fair |
2 | 83 | very good |
3 | 95 | excellent |
4 | 95 | excellent |
5 | 90 | excellent |
6 | 55 | fair |
7 | 40 | fair |
8 | 57 | fair |
9 | 65 | fair |
10 | 27 | fair |
SQL Call
SELECT * FROM Sampling(
ON students PARTITION BY any
USING
SampleFraction(0.2)
Seed(2)
) AS dt ORDER BY id;
Output
id score -- ----- 4 95.0 10 27.0 16 19.0 23 14.0 26 54.0 27 44.0 28 90.0 33 85.0 35 94.0 42 39.0 43 49.0 57 75.0 60 97.0 71 59.0 86 6.0 87 18.0 93 67.0
Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.