Sampling Methods - Analytics Database - Teradata Vantage

SQL Data Manipulation Language

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database, Teradata Vantage
Release Number: 17.20
Published: June 2022
Locale: en-US
Last edition: 2024-12-13
Product Category: Teradata Vantage™

Random Sampling

Vantage supports extracting a random sample from a database table using the SAMPLE clause and specifying one of the following:
  • The number of rows
  • A fraction of the total number of rows
  • A set of fractions as the sample
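
As a rough illustration of the three forms listed above, the following sketch samples from a hypothetical table named customer with hypothetical columns cust_id and state. The table and column names are assumptions made for this example only.

```sql
-- Hypothetical table: customer (cust_id, state, balance)

-- 1. Sample size given as a number of rows
SELECT cust_id, state
FROM customer
SAMPLE 100;

-- 2. Sample size given as a fraction of the total number of rows
SELECT cust_id, state
FROM customer
SAMPLE 0.25;

-- 3. A set of fractions, producing several samples in one request
SELECT cust_id, state
FROM customer
SAMPLE 0.25, 0.25, 0.50;
```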

This sampling method assumes that rows are sampled without replacement and that they are not reconsidered when another sample of the population is taken. This method results in mutually exclusive samples when you request multiple samples. In addition, the random sampling method assumes proportional allocation of rows across the AMPs in the system.
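
Because multiple samples are mutually exclusive, one request can split a table into non-overlapping subsets. The sketch below assumes the same hypothetical customer table and uses the SAMPLEID expression to show which sample each returned row belongs to.

```sql
-- Hypothetical table: customer (cust_id, state, balance)
-- Two samples are requested: 60% and 20% of the rows.
-- SAMPLEID is 1 for rows in the first sample and 2 for rows in the second.
-- No row is returned in both samples.
SELECT cust_id, SAMPLEID
FROM customer
SAMPLE 0.60, 0.20;
```

The remaining 20% of the rows fall into neither sample and are not returned.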

Random Stratified Sampling

In addition to the random sampling option, Vantage supports stratified sampling.

Random Stratified Sampling, also called proportional or quota random sampling, involves dividing the population into homogeneous subgroups and taking a random sample in each subgroup. Stratified sampling represents both the overall population and key subgroups of the population. The fraction specification for stratified sampling refers to the fraction of the total number of rows in the stratum.
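
A minimal sketch of a stratified request, again assuming a hypothetical customer table whose state column defines the strata. Each WHEN condition identifies one stratum, and the fraction after THEN applies to the rows in that stratum, not to the whole table.

```sql
-- Hypothetical table: customer (cust_id, state, balance)
-- 25% of the California rows, 50% of the New York rows,
-- and 10% of the rows from all other states.
SELECT cust_id, state, SAMPLEID
FROM customer
SAMPLE WHEN state = 'CA' THEN 0.25
       WHEN state = 'NY' THEN 0.50
       ELSE 0.10
END;
```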

The following apply to stratified sampling.

You can specify:
  • Stratified sampling in derived tables, views, and macros
  • Either a fraction or an integer as the sample size for every stratum
  • Up to 16 mutually exclusive samples for each stratum

You cannot specify:
  • Stratified sampling with set operations or subqueries
  • Fraction and integer combinations
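
The rules above might look like this in practice. This is a sketch only: the customer table and the derived table alias s are hypothetical. It places a stratified sample inside a derived table, uses integer sample sizes throughout, and requests two mutually exclusive samples for one stratum.

```sql
-- Hypothetical table: customer (cust_id, state, balance)
-- Stratified sampling inside a derived table, integer sizes only.
-- The 'CA' stratum yields two mutually exclusive samples of 50 rows each.
SELECT s.state, COUNT(*) AS sampled_rows
FROM (
    SELECT cust_id, state
    FROM customer
    SAMPLE WHEN state = 'CA' THEN 50, 50
           WHEN state = 'NY' THEN 25
           ELSE 10
    END
) AS s
GROUP BY s.state;

-- Not allowed: mixing a fraction and an integer in the same request,
-- for example WHEN state = 'CA' THEN 0.25 WHEN state = 'NY' THEN 25.
```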