Exclusion Merge Join Process
- The left and right tables are distributed and sorted based on the row hash values of the join columns.
- For each left table row, read all right table rows having the same row hash value until one is found that matches the join condition.
- Produce the join result.
If no matching right table rows are found, return the left row.
Examples of Exclusion Merge Joins
The following SELECT request is an example for which the Optimizer uses an exclusion merge join for its join plan:
SELECT name FROM employee WHERE dept_no NOT IN (SELECT dept_no FROM department WHERE loc <> ‘CHI’);
The following stages document a concrete example of the exclusion join process:
- All AMPs are searched for department rows where loc <> ' CHI '.
- The multiple rows found to satisfy this condition, that for department 600, is placed in a spool on the same AMP.
- The spool containing the single department row is redistributed.
- The rows in the two spools undergo an exclusion merge join on each AMP.
- Name information for any employee row whose dept_no is not 600 is placed in a result spool on each AMP.
When the last AMP has completed its portion of the join, the contents of all result spools are sent to the user by means of a BYNET merge.
The processing stages of an exclusion merge join are like those used in the exclusion product join (see Exclusion Join), with the following exceptions:
- Multiple rows are retrieved in stage 2.
- Stages 2 and 4 are combined in the exclusion merge join, and redistribution occurs instead of duplication.
- Stage 3 is removed.
- Stage 5 is changed to an exclusion product join.
Consider the following employee and customer tables for the next exclusion merge join example:
e_num | name | job_code |
---|---|---|
1 | Brown | 512101 |
2 | Smith | 412101 |
3 | Jones | 512101 |
4 | Clay | 412101 |
5 | Peters | 512101 |
6 | Foster | 512101 |
7 | Gray | 413201 |
8 | Baker | 512101 |
Column e_num is the UPI and PK, and job_code is a FK.
cust | sales_enum |
---|---|
23 | 6 |
24 | 3 |
25 | 8 |
26 | 1 |
27 | 6 |
28 | 8 |
29 | 1 |
30 | 6 |
Column cust is the UPI and PK, and sales_enum is a FK.
The graphic that follows illustrates the row redistribution caused by the following SELECT request:
SELECT name FROM employee WHERE job_code = 512101 AND e_num NOT IN (SELECT sales_enum FROM customer);