Purpose
Identifies the sample to which a row belongs, distinguishing rows belonging to different samples specified in the SAMPLE clause of a SELECT statement.
Syntax
Syntax Elements
- SAMPLEID
- Select list or ORDER BY clause expression that indicates that sample identifiers are to be used to link result set rows with their originating samples.
ANSI Compliance
SAMPLEID is a Teradata extension to the ANSI SQL:2011 standard.
Definition of a Sample ID
The sample ID identifies the sample to which a row belongs in the left-to-right order of the SAMPLE clause specification, from 1 through n (where n is the number of samples requested in the SAMPLE clause).
Rules and Restrictions for SAMPLEID
The rules and restrictions are:
- You can only specify SAMPLEID with a SAMPLE clause, which can appear either as part of a select list or as an ORDER BY clause expression.
- SAMPLEID cannot be the only term specified in the select list. If you specify SAMPLEID, you must also specify at least one other non-SAMPLEID column expression.
Using SAMPLEID With Stratified Sampling
The SAMPLEID value for stratified sampling is simply 1, 2, 3, … n across n specified samples regardless of stratification. That is, for the following SAMPLE clause,
SAMPLE WHEN state = 'CA' THEN |
0.3, |
0.2 |
ELSE |
0.5, |
0.2 |
|
the SAMPLEID correspondence would be: | 1 | 2 | 3 | 4 |
Example: Three Sample Sets Using SAMPLEID as a Select List Expression
The following SELECT statement provides three mutually exclusive sample sets (60 percent for sample category x, 25 percent for sample category y, and 15 percent for sample category z) from a customer table.
The results are returned in a table with three columns: cust_name, cust_addr, and SAMPLEID.
The integer in the SAMPLEID column identifies whether the row belongs to the 0.6 sample, the 0.25 sample, or the 0.15 sample.
The samples are identified as 1 through 3, in left-to-right order from the SAMPLE clause, so 0.6 is identified by 1, 0.25 by 2, and 0.15 by 3.
SELECT cust_name, cust_addr, SAMPLEID FROM customer_table SAMPLE 0.6, 0.25, 0.15;
A partial results table might look something like the following:
cust_name | cust_addr | SAMPLEID |
---|---|---|
Jones Pharmaceuticals | 4235 Lawler Road Memphis, TN USA | 1 |
Fong Furniture | 310 East Highway 5 Hong Kong | 2 |
Subramaniam Spice Exports | 455 1/2 Gandhi Lane Hyderabad India | 3 |
Forrester Property Management | 1 West Broadway Syracuse, New York USA | 1 |
Otomo Consulting | 33 Korakuen Hall Tokyo Japan | 1 |
Adler Music Publishing, Ltd. | 5 East 245th Street Nashville, TN USA | |
O’Brien Metals | 83 Heatherington The Whithers Cobblestone-on-Treads United Kingdom | 1 |
Irama Rice Importers | 8562 Rhoma Lane Jakarta Indonesia | 2 |
Abdelwahab Fine Egyptian Rugs | 1723 Kulthum Avenue Cairo Egypt | 1 |
Bachar Ouds | 18 Rashied Diagonal Baghdad Iraq | 1 |
Example: Using SAMPLEID With PERIOD Value Expressions
The following example shows how you can specify a PERIOD value expression with a SAMPLEID expression, where period_of_stay is the PERIOD value expression.
SELECT emp_no, period_of_stay, SAMPLEID FROM employee SAMPLE 0.5,0.5;