A subquery is correlated when it references columns of outer tables in an enclosing or containing outer query. The expression correlated subquery comes from the explicit requirement for the use of correlation names in any correlated subquery in which the same table is referenced in both the internal and external query.
The reference from the inner query to its containing query is shown in this illustration.
When the inner table references a foreign key column of a different table, the column must be fully qualified in the interior WHERE clause.
The effect of a correlated subquery is to provide an implicit loop function within any standard SQL DML statement, including ABORT, DELETE, INSERT, and UPDATE.
A reference to variables in an outer expression by a variable in an inner expression is called an outer reference.
IF a scalar subquery references tables in its containing query that are … |
THEN it is said to be … |
not referenced in its own FROM clause |
correlated. |
referenced in its own FROM clause |
noncorrelated. |
If a column reference is made from an inner query to an outer query, then it must be fully qualified with the appropriate table name from the FROM clause of the outer query.
Correlated subqueries are ANSI SQL:2011-compliant.
The rules and restrictions are:
For example, this SELECT statement specifies an equality condition in its subquery and specifies an inequality condition in the main query:
SELECT *
FROM table_1
WHERE x < ALL (SELECT y
FROM table_2
WHERE table_2.n = table_1.n);
For example, this SELECT statement specifies three levels of depth:
SELECT *
FROM table_1
WHERE x = (SELECT y
FROM table_2
WHERE table_2.n = table_1.n
AND t2.m = (SELECT y
FROM table_3
WHERE table_3.n = table_2.m
AND table_1.l = table_3.l));
To differentiate between the references to the inner and outer tables, one of them must be renamed with a correlation name.
In the first example, the outer table is aliased, while in the second example, the inner table is aliased.
Also note that in the first example, the outer reference is a.n
, while in the second example, the outer reference is table_1.n
.
SELECT *
FROM table_1 AS a
WHERE x < (SELECT AVG(table_1.x)
FROM table_1
WHERE table_1.n = a.n);
SELECT *
FROM table_1
WHERE x < (SELECT AVG(a.x)
FROM table_1 AS a
WHERE table_1.n = a.n);
IF a SELECT is specified in … |
THEN … |
the main query |
any table referenced in the main query should be specified in the FROM clause of the main query SELECT. |
a subquery |
any table referenced in that subquery must be specified either in the FROM clause of that query or in some outer query. If the select expression list of the subquery specifies a column reference to a table in the FROM clause of an outer query, the column reference must be fully qualified. If the correlated condition specifies an unqualified column reference, then Teradata Database searches for the column starting with the tables and views in the current subquery toward the tables and views in the outermost query. |
IF a table referenced in a subquery is identified in the FROM clause of … |
THEN the table reference is … |
a subquery |
local and the subquery is noncorrelated. |
an outer query |
not local and the subquery is correlated. |
The following rules apply to specifying a FROM clause in a correlated subquery.
Teradata Database returns a warning message if you do not specify a referenced table in the FROM clause.
The EXISTS quantifier is supported as the predicate of a search condition in a WHERE clause.
EXISTS tests the result of the subquery. The subquery is performed for each row of the table specified in the outer query when the subquery is correlated.
If the subquery returns response rows, then its WHERE condition is satisfied.
Outer references behave as described in the following process. This process does not mirror the query plan generated by the Optimizer. Instead, this process describes how a correlated subquery works at a conceptual level.
1 For each row of an outer query, the values of the outer references in that row are used to evaluate the result of the inner subquery.
2 The inner subquery expression result is joined with its associated row from the outer query based on the specified join constraint for the inner query.
The semantics of correlated subqueries imply that an inner subquery expression is executed once for each row of its immediate outer query expression. The semantics do not guarantee that each iteration of the subquery produces a unique result.
The following example of this behavior uses the following simple employee table.
employee |
|
|
|
emp_no |
emp_name |
sex |
age |
PI |
NN |
||
101 |
Friedrich |
F |
23 |
102 |
Harvey |
M |
47 |
103 |
Agrawal |
M |
65 |
104 |
Valduriez |
M |
34 |
105 |
Cariño |
F |
39 |
106 |
Au |
M |
28 |
107 |
Takamoto |
F |
51 |
108 |
Ghazal |
F |
26 |
The following SELECT statement shows the behavior of a simple correlated subquery. Because the same table is referenced in both the inner and outer queries, both references are given correlation names for clarity, though only one of them (it makes no difference which) must be aliased.
SELECT *
FROM employee AS e1
WHERE age < (SELECT MAX(age)
FROM employee AS e2
WHERE e1.sex = e2.sex);
Here is the process:
1 Two copies of the table described earlier are generated, one as e1 and the other as e2.
2 Evaluation of the inner query requires data from the outer, containing, query.
The evaluation of the inner query becomes a set of virtual individual queries such as the following:
SELECT 101, 'Friedrich', 'F', 23
FROM employee AS e1
WHERE 23 < (SELECT MAX(age)
FROM employee AS e2
WHERE 'F' = e2.sex;
…
SELECT 108, 'Ghazal', 'F', 26
FROM employee as e1
WHERE 26 < (SELECT MAX(age)
FROM employee AS e2
WHERE 'F' = e2.sex;
3 The expanded individual queries once the subquery has been evaluated for each row in the inner query look like the following:
SELECT 101, 'Friedrich', F, 23
FROM employee AS e1
WHERE 23 < 51;
SELECT 102, 'Harvey', M, 47
FROM employee AS e1
WHERE 47 < 65;
SELECT 103, 'Agrawal', M, 65
FROM employee AS e1
WHERE 65 < 65;
SELECT 104, 'Valduriez', M, 34
FROM employee AS e1
WHERE 34 < 65;
SELECT 105, 'Cariño', F, 39
FROM employee AS e1
WHERE 39 < 51;
SELECT 106, 'Au', M, 28
FROM employee AS e1
WHERE 28 < 65;
SELECT 107, 'Takamoto', F, 51
FROM employee AS e1
WHERE 51 < 51;
SELECT 108, 'Ghazal', F, 26
FROM employee AS e1
WHERE 26 < 51;
4 Teradata Database performs the same evaluations for each row in the outer query.
5 Employee 103, Agrawal, is eliminated from the result because his age is not less than the maximum male age in the table. Similarly, employee 107, Takamoto, is eliminated because her age is not less than the maximum female age in the table.
The final result is reported in the following table.
emp_no emp_name sex age
----- ------- --- ---
101 Friedrich F 23
102 Harvey M 47
104 Valduriez M 34
105 Cariño F 39
106 Au M 28
108 Ghazal F 26
To perform a subquery as an noncorrelated subquery and not as a correlated subquery, observe the following guidelines.
In the following SELECT statement, no entry in predicate_2 of the subquery references any entry in table_list_1 of the main query; therefore, the subquery is completely self-contained and is not correlated. A self-contained subquery is also referred to as a local subquery.
SELECT column_list_1
FROM table_list_1
WHERE predicate_1 (SELECT column_list_2
FROM table_list_2
WHERE predicate_2);
The relationship between inner and outer query predicates is referred to as their correlation. The subquery in the previous statement is said to be uncorrelated, or noncorrelated, because it does not reference tables in the outer query. Because its subquery is local, the request restricts the number of its iterations to one. The results of the query are then joined with the results of the query made by the outer SELECT statement.
Correlated subqueries perform the subquery in parentheses once for each result row of the outer query. When a subquery references tables in the outer query that are not referenced within itself, the subquery is said to be correlated because its result is directly correlated with the outer query. A correlated subquery performs once for each value from its containing outer query. It does not necessarily produce a unique result for each of those iterations.
The following example shows the behavior of an SQL request that contains a correlated subquery.
Assume that table_1 has columns col_1 and col_2, while table_2 has columns col_3 and col_4. The following four rows exist in the two tables.
table_1 |
|
|
table_2 |
|
col_1 |
col_2 |
|
col_3 |
col_4 |
100 |
1 |
|
100 |
1 |
50 |
1 |
|
50 |
1 |
20 |
2 |
|
20 |
2 |
40 |
2 |
|
40 |
2 |
The following SELECT statement uses a correlated subquery in its WHERE clause to query table_1 and table_2.
SELECT *
FROM table_1
WHERE col_1 IN (SELECT MAX(col_3)
FROM table_2
WHERE table_1.col_2=table_2.col_4);
The statement returns two response rows because the subquery is performed four times: once for each row in table_1.
The result contains only 2 response rows because of the MAX(col_3)
aggregation constraint (two of the subquery executions return a response row where
col_1 is not in the result).
The two rows returned are these:
The four executions of the subquery return the following response rows:
col_3 |
col_4 |
100 |
1 |
100 |
1 |
40 |
2 |
40 |
2 |
Only the first and fourth rows of table_1 have a value for col_1 in this result set. If you had not specified the MAX aggregate function, then all four rows of table_1 would have been returned.