Outer references behave as described in the following process. This process does not mirror the query plan generated by the Optimizer. Instead, this process describes how a correlated subquery works at a conceptual level.
- For each row of an outer query, the values of the outer references in that row are used to evaluate the result of the inner subquery.
- The inner subquery expression result is joined with its associated row from the outer query based on the specified join constraint for the inner query.
The semantics of correlated subqueries imply that an inner subquery expression is executed once for each row of its immediate outer query expression. The semantics do not guarantee that each iteration of the subquery produces a unique result.
The following example of this behavior uses the following simple employee table.
employee | |||
---|---|---|---|
emp_no | emp_name | gender | age |
PI | NN | ||
101 | Friedrich | F | 23 |
102 | Harvey | M | 47 |
103 | Agrawal | M | 65 |
104 | Valduriez | M | 34 |
105 | Cariño | F | 39 |
106 | Au | M | 28 |
107 | Takamoto | F | 51 |
108 | Ghazal | F | 26 |
The following SELECT statement shows the behavior of a simple correlated subquery. Because the same table is referenced in both the inner and outer queries, both references are given correlation names for clarity, though only one of them (it makes no difference which) must be aliased.
SELECT * FROM employee AS e1 WHERE age < (SELECT MAX(age) FROM employee AS e2 WHERE e1.gender = e2.gender);
Here is the process:
- Two copies of the table described earlier are generated, one as e1 and the other as e2.
- Evaluation of the inner query requires data from the outer, containing, query.
The evaluation of the inner query becomes a set of virtual individual queries such as the following:
SELECT 101, 'Friedrich', 'F', 23 FROM employee AS e1 WHERE 23 < (SELECT MAX(age) FROM employee AS e2 WHERE 'F' = e2.gender; … SELECT 108, 'Ghazal', 'F', 26 FROM employee as e1 WHERE 26 < (SELECT MAX(age) FROM employee AS e2 WHERE 'F' = e2.gender;
- The expanded individual queries once the subquery has been evaluated for each row in the inner query look like the following:
SELECT 101, 'Friedrich', F, 23 FROM employee AS e1 WHERE 23 < 51; SELECT 102, 'Harvey', M, 47 FROM employee AS e1 WHERE 47 < 65; SELECT 103, 'Agrawal', M, 65 FROM employee AS e1 WHERE 65 < 65; SELECT 104, 'Valduriez', M, 34 FROM employee AS e1 WHERE 34 < 65; SELECT 105, 'Cariño', F, 39 FROM employee AS e1 WHERE 39 < 51; SELECT 106, 'Au', M, 28 FROM employee AS e1 WHERE 28 < 65; SELECT 107, 'Takamoto', F, 51 FROM employee AS e1 WHERE 51 < 51; SELECT 108, 'Ghazal', F, 26 FROM employee AS e1 WHERE 26 < 51;
- Vantage performs the same evaluations for each row in the outer query.
- Employee 103, Agrawal, is eliminated from the result because his age is not less than the maximum male age in the table. Similarly, employee 107, Takamoto, is eliminated because her age is not less than the maximum female age in the table.
The final result is reported in the following table.
emp_no emp_name gender age ----- ------- ------ --- 101 Friedrich F 23 102 Harvey M 47 104 Valduriez M 34 105 Cariño F 39 106 Au M 28 108 Ghazal F 26