Sampling Methods - Teradata Database

SQL Data Manipulation Language

Product: Teradata Database
Release Number: 15.00
Language: English (United States)
Last Update: 2018-09-28
dita:id: B035-1146
lifecycle: previous
Product Category: Teradata® Database

Sampling Methods

Random Sampling

Teradata Database supports extracting a random sample from a database table using the SAMPLE clause and specifying one of the following:

  • The number of rows
  • A fraction of the total number of rows
  • A set of fractions as the sample

This sampling method assumes that rows are sampled without replacement: a row that has been selected is not reconsidered when another sample of the population is taken. As a result, requesting multiple samples yields mutually exclusive samples. In addition, the random sampling method assumes proportional allocation of rows across the AMPs in the system.
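The three forms of sample specification listed above can be sketched as follows. This is a minimal illustration, assuming a hypothetical table named employee with columns employee_id and department_number; neither name comes from this section. The third query also selects SAMPLEID, the Teradata expression that reports which of the requested samples a row belongs to.

```sql
-- Hypothetical table (an assumption for illustration only):
--   employee(employee_id, department_number, ...)

-- Sample size given as a number of rows: 10 randomly chosen rows.
SELECT employee_id, department_number
FROM employee
SAMPLE 10;

-- Sample size given as a fraction: about 25% of the rows in the table.
SELECT employee_id, department_number
FROM employee
SAMPLE 0.25;

-- A set of fractions: three samples drawn without replacement,
-- so no row appears in more than one sample.
-- SAMPLEID shows which sample each returned row was assigned to.
SELECT employee_id, department_number, SAMPLEID
FROM employee
SAMPLE 0.25, 0.25, 0.50
ORDER BY SAMPLEID;
```

Because the samples are mutually exclusive, the fractions in a set are drawn from the same population and should not add up to more than 1.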

Random Stratified Sampling

In addition to the random sampling option, Teradata Database supports stratified sampling.

Random stratified sampling, also called proportional or quota random sampling, involves dividing the population into homogeneous subgroups (strata) and taking a random sample in each subgroup. Stratified sampling represents both the overall population and key subgroups of the population. The fraction specification for stratified sampling refers to the fraction of the total number of rows in the stratum, not in the whole table.
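A stratified sample is written with the WHEN ... THEN ... END form of the SAMPLE clause, where each WHEN condition defines one stratum. The sketch below assumes a hypothetical table named stores with columns store_id, city, and state; the table, the column names, and the fractions are illustrative only.

```sql
-- Hypothetical table (an assumption for illustration only):
--   stores(store_id, city, state)

-- Each WHEN condition defines a stratum. Each fraction applies to the
-- rows of that stratum, not to the whole table.
SELECT store_id, city, state, SAMPLEID
FROM stores
SAMPLE WHEN state = 'WI' THEN 0.25   -- about 25% of the Wisconsin stores
       WHEN state = 'CA' THEN 0.10   -- about 10% of the California stores
       ELSE 0.05                     -- about 5% of all remaining stores
       END
ORDER BY state;
```

With this form, a small but important subgroup can be sampled at a higher rate than the rest of the population, which is harder to guarantee with a single fraction applied to the whole table.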

The following apply to stratified sampling.

You can specify:

  • stratified sampling in derived tables, views, and macros
  • either a fraction or an integer as the sample size for every stratum
  • up to 16 mutually exclusive samples for each stratum

You cannot specify:

  • stratified sampling with set operations or subqueries
  • fraction and integer combinations
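The rules above can be illustrated with a short sketch, again assuming the hypothetical stores(store_id, city, state) table. It places a stratified sample inside a derived table and requests more than one sample per stratum. The derived-table alias s and the column alias sampled_rows are also illustrative.

```sql
-- Stratified sampling inside a derived table, with two mutually
-- exclusive samples per stratum (up to 16 are permitted per stratum).
-- Every stratum uses fractions; none uses a row count.
SELECT s.state, s.SampleNo, COUNT(*) AS sampled_rows
FROM (
    SELECT store_id, state, SAMPLEID AS SampleNo
    FROM stores
    SAMPLE WHEN state = 'WI' THEN 0.25, 0.50
           WHEN state = 'CA' THEN 0.10, 0.20
           END
) AS s
GROUP BY s.state, s.SampleNo
ORDER BY s.state, s.SampleNo;

-- The same query could use integers for every stratum instead,
-- for example THEN 20, 40 and THEN 10, 15.
-- Mixing the two, such as THEN 0.25 in one stratum and THEN 10 in
-- another, falls under the "fraction and integer combinations"
-- restriction and is not accepted.
```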