16.10 - Using Outer Joins to Define Join Indexes - Teradata Database

Teradata Database Design

Product
Teradata Database
Release Number
16.10
Release Date
June 2017
Content Type
User Guide
Publication ID
B035-1094-161K
Language
English (United States)

There are several benefits in defining non-aggregate join indexes with outer joins:

  • Unmatched rows are preserved.

    These rows allow the join index to satisfy queries with fewer join conditions than those used to generate the index.

  • Outer table row scans can provide the same performance benefits as a single-table join index.

    For example, the Optimizer can choose to scan the outer table rows of a join index to satisfy a query that only references the outer table provided that a join index scan would be more high-performing than scanning the base table or redistributing rows.

Redefined Join Index

The following example changes the previous join index example  (see Defining a Simple Join Index on a Binary Join Result) to use an Outer Join in the join index definition.

     CREATE JOIN INDEX OrdCustIdx AS
     SELECT (o_custkey,c_name)
            ,
            (o_status,o_date,o_comment)
     FROM orders LEFT JOIN customer ON o_custkey=c_custkey;

Materialized Join Index

The resulting join index rows would be the following (where the ? character indicates a null).

OrdCustIdx
Fixed Part Repeated Part
CustKey Name Status Date Comment
100 Robert S 2004-10-01 big order
S 2004-10-05 credit
101 Ann P 2004-10-08 discount
102 Don S 2004-10-01 rush order
D 2004-10-03 delayed
? ? U 2004-10-05 unknown customer

With the join index defined in this way, the following query could be resolved using just the join index, without having to scan the base tables.

     SELECT o_status, o_date, o_comment
     FROM orders;

In this particular case, it is more efficient to access the join index than it is to access the orders base table. This is true whenever the cost of scanning the join index is less than the cost of scanning the orders table. The Optimizer evaluates both access methods, choosing the more efficient, less costly of the two for its query plan.