Comparing Correlated and Noncorrelated Subqueries - Advanced SQL Engine - Teradata Database

SQL Data Manipulation Language

Product
Advanced SQL Engine
Teradata Database
Release Number
17.00
Published
September 2020
Language
English (United States)
Last Update
2021-01-23
dita:mapPath
qtb1554762060450.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata Vantage™

In the following SELECT statement, no entry in predicate_2 of the subquery references any entry in table_list_1 of the main query; therefore, the subquery is completely self-contained and is not correlated. A self-contained subquery is also referred to as a local subquery.

     SELECT column_list_1
     FROM table_list_1
     WHERE predicate_1 (SELECT column_list_2
                        FROM table_list_2
                        WHERE predicate_2);

The relationship between inner and outer query predicates is referred to as their correlation. The subquery in the previous statement is said to be uncorrelated, or noncorrelated, because it does not reference tables in the outer query. Because its subquery is local, the request restricts the number of its iterations to one. The results of the query are then joined with the results of the query made by the outer SELECT statement.

Correlated subqueries perform the subquery in parentheses once for each result row of the outer query. When a subquery references tables in the outer query that are not referenced within itself, the subquery is said to be correlated because its result is directly correlated with the outer query. A correlated subquery performs once for each value from its containing outer query. It does not necessarily produce a unique result for each of those iterations.

The following example shows the behavior of an SQL request that contains a correlated subquery.

Assume that table_1 has columns col_1 and col_2, while table_2 has columns col_3 and col_4. The following four rows exist in the two tables.

table_1     table_2  
col_1 col_2   col_3 col_4
               100 1                  100 1
                 50 1                    50 1
                 20 2                    20 2
                 40 2                    40 2

The following SELECT statement uses a correlated subquery in its WHERE clause to query table_1 and table_2.

     SELECT *
     FROM table_1
     WHERE col_1 IN (SELECT MAX(col_3)
                     FROM table_2
                     WHERE table_1.col_2=table_2.col_4);

The statement returns two response rows because the subquery is performed four times: once for each row in table_1.

The result contains only 2 response rows because of the MAX(col_3) aggregation constraint and two of the subquery executions return a response row where col_1 is not in the result.

The two rows returned are these:
  • col_1=100, col_2=1
  • col_1=40, col_2=2.

The four executions of the subquery return the following response rows:

col_3 col_4
100 1
100 1
40 2
40 2

Only the first and fourth rows of table_1 have a value for col_1 in this result set. If you had not specified the MAX aggregate function, then all four rows of table_1 would have been returned.