Sampling Methods - Teradata Database - Teradata Vantage NewSQL Engine

SQL Data Manipulation Language

Product
Teradata Database
Teradata Vantage NewSQL Engine
Release Number
16.20
Published
March 2019
Language
English (United States)
Last Update
2019-05-03
dita:mapPath
fbo1512081269404.ditamap
dita:ditavalPath
TD_DBS_16_20_Update1.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata Vantage™

Random Sampling

Teradata Database supports extracting a random sample from a database table using the SAMPLE clause and specifying one of the following:
  • The number rows
  • A fraction of the total number of rows
  • A set of fractions as the sample

This sampling method assumes that rows are sampled without replacement and that they are not reconsidered when another sample of the population is taken. This method results in mutually exclusive samples when you request multiple samples. In addition, the random sampling method assumes proportional allocation of rows across the AMPs in the system.

Random Stratified Sampling

In addition to random sampling option, Teradata Database supports stratified sampling.

Random Stratified Sampling, also called proportional or quota random sampling, involves dividing the population into homogeneous subgroups and taking a random sample in each subgroup. Stratified sampling represents both the overall population and key subgroups of the population. The fraction specification for stratified sampling refers to the fraction of the total number of rows in the stratum.

The following apply to stratified sampling.

You can specify… You cannot specify…
stratified sampling in derived tables, views, and macros stratified sampling with set operations or subqueries
either a fraction or an integer as the sample size for every stratum fraction and integer combinations
up to 16 mutually exclusive samples for each stratum