Sample | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex, Lake
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: June 2025
Last Edition: 2025-07-02
Product Category: Teradata Vantage
The Sample analysis function randomly selects rows from a table or view, producing one or more samples based on either a specified number of rows or a fraction of the total number of rows. The sampled rows can be stored in the following ways:
  • Single table
  • Separate table for each sample
  • Single table with a view created for each sample
Options are provided for sampling with or without replacement of rows, for randomized or proportional allocation by AMP, and for stratified or simple random sampling. They are described in the following table:
Option Description
With or without replacement of rows
  • Without replacement

    Sampling of rows is performed without row replacement by default. Each sampled row in a request is unique: once a row is sampled, it is not returned to the sampling pool for that request. Therefore, it is not possible to sample more rows than exist in the sampled table. If multiple samples are requested, they are mutually exclusive.

  • With replacement

    Each sampled row is immediately returned to the sampling pool and can be selected multiple times. If multiple samples are requested with replacement, the samples are not necessarily mutually exclusive.
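
The difference between the two modes can be illustrated with a small Python sketch. This is a conceptual model only, not the SQL that the Sample function generates; the function names `sample_without_replacement` and `sample_with_replacement` are hypothetical and chosen for this illustration.

```python
import random


def sample_without_replacement(rows, sizes, rng):
    """Draw one sample per requested size; no row appears more than once."""
    needed = sum(sizes)
    if needed > len(rows):
        # Mirrors the rule that you cannot sample more rows than exist.
        raise ValueError("cannot sample more rows than the table contains")
    drawn = rng.sample(rows, needed)  # unique rows across all samples
    samples, start = [], 0
    for size in sizes:
        samples.append(drawn[start:start + size])
        start += size
    return samples  # the samples are mutually exclusive


def sample_with_replacement(rows, sizes, rng):
    """Each draw returns the row to the pool, so repeats are possible."""
    return [rng.choices(rows, k=size) for size in sizes]


if __name__ == "__main__":
    rng = random.Random(42)
    table = list(range(1, 11))  # stand-in for ten row identifiers
    print(sample_without_replacement(table, [3, 4], rng))
    print(sample_with_replacement(table, [3, 4], rng))
```

In the second function a sample can be larger than the table, and the same row can appear both within a sample and across samples.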

Randomized or proportional allocation by AMP
  • Proportional allocation

    Sampling of rows is performed with proportional allocation by default. Requested rows are allocated across the AMPs in proportion to the number of rows on each AMP. This is not considered a simple random sample because not every possible sample set can be selected. This option is much faster than randomized allocation, especially for large sample sizes, and still results in an allocation that is random enough for most applications.

  • Randomized allocation

    Requested rows are allocated across the AMPs by simulating simple random sampling, a process that can be slower than proportional allocation.
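
The following Python sketch contrasts the two allocation strategies. It only models how many of the requested rows each AMP would contribute; it is not the database's actual algorithm. The function names and the largest-remainder rounding rule are assumptions made for illustration.

```python
import bisect
import itertools
import random


def proportional_allocation(amp_row_counts, n):
    """Split n requested rows across AMPs in proportion to rows per AMP."""
    total = sum(amp_row_counts)
    if n > total:
        raise ValueError("cannot sample more rows than the table contains")
    # Integer arithmetic: whole share plus a remainder for each AMP.
    parts = [divmod(n * count, total) for count in amp_row_counts]
    alloc = [quotient for quotient, _ in parts]
    leftover = n - sum(alloc)
    # Assumed rounding rule: give leftover rows to the largest remainders.
    order = sorted(range(len(parts)), key=lambda i: parts[i][1], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


def randomized_allocation(amp_row_counts, n, rng):
    """Simulate simple random sampling and count the rows drawn per AMP."""
    total = sum(amp_row_counts)
    bounds = list(itertools.accumulate(amp_row_counts))
    alloc = [0] * len(amp_row_counts)
    for position in rng.sample(range(total), n):
        # Map a global row position to the AMP that holds it.
        alloc[bisect.bisect_right(bounds, position)] += 1
    return alloc


if __name__ == "__main__":
    counts = [100, 300, 600]  # hypothetical rows per AMP
    print(proportional_allocation(counts, 10))
    print(randomized_allocation(counts, 10, random.Random(1)))
```

With proportional allocation the per-AMP counts are fixed, so a sample drawn entirely from one AMP can never occur. Under randomized allocation such a sample is unlikely but possible, which is why only the latter behaves like a simple random sample.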

Stratified or simple random sampling
  • Simple

    Sampling of rows is performed with simple random sampling by default. Each possible set of the requested sample size has an equal probability of being selected (subject to the limitations of proportional allocation discussed previously).

  • Stratified

    Available rows are divided into groups, or strata, based on conditions that are defined before sampling. Samples of the requested size are then taken from the strata.
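
Stratified sampling can be sketched in Python as follows. This is a conceptual illustration, not the function's implementation. It assumes that each stratum has its own condition and requested size, that a row belongs to the first stratum whose condition it satisfies, and that a stratum with too few rows returns all of them; the name `stratified_sample` and the sample data are hypothetical.

```python
import random


def stratified_sample(rows, strata, rng):
    """Sample from each stratum; strata is a list of (condition, size) pairs."""
    groups = [[] for _ in strata]
    for row in rows:
        for index, (condition, _) in enumerate(strata):
            if condition(row):
                groups[index].append(row)  # first matching condition wins
                break
    # Sample without replacement inside each stratum (assumed behavior).
    return [
        rng.sample(group, min(size, len(group)))
        for group, (_, size) in zip(groups, strata)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    customers = [{"id": i, "age": 18 + i} for i in range(40)]
    strata = [
        (lambda r: r["age"] < 30, 3),   # younger customers: 3 rows
        (lambda r: r["age"] >= 30, 5),  # older customers: 5 rows
    ]
    for sample in stratified_sample(customers, strata, rng):
        print([row["id"] for row in sample])
```

Rows that match no condition are left out of every sample, which is one way to restrict sampling to the strata of interest.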

The Sample analysis function is defined by specifying the parameters of the table and columns to analyze. Each Sample example contains the td_analyze call statement, the generated SQL, and the expected results.