There are several benefits in defining non-aggregate join indexes with outer joins:
- Unmatched rows are preserved.
These rows allow the join index to satisfy queries with fewer join conditions than those used to generate the index.
- Outer table row scans can provide the same performance benefits as a single-table join index.
For example, the Optimizer can choose to scan the outer table rows of a join index to satisfy a query that only references the outer table provided that a join index scan would be more high-performing than scanning the base table or redistributing rows.
Redefined Join Index
The following example changes the previous join index example (see Defining a Simple Join Index on a Binary Join Result) to use an Outer Join in the join index definition.
CREATE JOIN INDEX OrdCustIdx AS SELECT (o_custkey,c_name) , (o_status,o_date,o_comment) FROM orders LEFT JOIN customer ON o_custkey=c_custkey;
Materialized Join Index
The resulting join index rows would be the following (where the ? character indicates a null).
Fixed Part | Repeated Part | |||
---|---|---|---|---|
CustKey | Name | Status | Date | Comment |
100 | Robert | S | 2004-10-01 | big order |
S | 2004-10-05 | credit | ||
101 | Ann | P | 2004-10-08 | discount |
102 | Don | S | 2004-10-01 | rush order |
D | 2004-10-03 | delayed | ||
? | ? | U | 2004-10-05 | unknown customer |
With the join index defined in this way, the following query could be resolved using just the join index, without having to scan the base tables.
SELECT o_status, o_date, o_comment FROM orders;
In this particular case, it is more efficient to access the join index than it is to access the orders base table. This is true whenever the cost of scanning the join index is less than the cost of scanning the orders table. The Optimizer evaluates both access methods, choosing the more efficient, less costly of the two for its query plan.