17.10 - Defining a Simple Join Index on a Binary Join Result - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - Database Design

Advanced SQL Engine
Teradata Database
Release Number
July 2021
English (United States)
Last Update

Table Definitions

Suppose you define the following customer and orders tables.

CREATE TABLE customer (
  c_custkey    INTEGER NOT NULL,
  c_name       CHARACTER(26) NOT NULL,
  c_address    VARCHAR(41),
  c_nationkey  INTEGER,
  c_phone      CHARACTER(16),
  c_acctbal    DECIMAL(13,2),
  c_mktsegment CHARACTER(21),
  c_comment    VARCHAR(127))
PRIMARY INDEX(c_custkey);

  o_orderkey      INTEGER NOT NULL,
  o_date          DATE FORMAT ‘yyyy-mm-dd’,
  o_status        CHARACTER(1),
  o_custkey       INTEGER,
  o_totalprice    DECIMAL(13,2),
  o_orderpriority CHARACTER(21),
  o_clerk         CHARACTER(16),
  o_shippriority  INTEGER,
  o_comment       VARCHAR(79))

Example Query Request

Consider the following SELECT request against these tables.

SELECT o_custkey, c_name, o_status, o_date, o_comment
FROM orders, customer
WHERE o_custkey=c_custkey;

The next few topics examine the query plan for this SELECT request: first without and then with a join index.

Query Plan: No Join Index Defined

Without a defined join index, the execution plan for this query would typically redistribute the orders table into a spool, sort the spool on o_custkey, and then perform a merge join between the spool and the customer table.

Query Plan: Join Index Defined

Now consider the execution plan for this same query when the following join index has been defined:

SELECT (o_custkey, c_name), (o_status, o_date, o_comment)
FROM orders, customer
WHERE o_custkey=c_custkey;

With this join index defined, the execution plan for the query specifies a simple scan of the join index without accessing any of the underlying base tables and without having to join them on the predicate WHERE o_custkey = c_custkey.

In the join index defined for this example, (o_custkey, c_name) is the specified fixed part of the index and (o_status, o_date, o_comment) is the repeated portion. Therefore, assume the following specimen base table entries (where the ? character indicates a null).

CustKey Name Address
100 Robert San Diego
101 Ann Palo Alto
102 Don El Segundo
OrderKey Date Status CustKey Comment
5000 2004-10-01 S 102 rush order
5001 2004-10-01 S 100 big order
5002 2004-10-03 D 102 delayed
5003 2004-10-05 U ? unknown customer
5004 2004-10-05 S 100 credit

You cannot collect statistics on a complex expression from a base table. If your applications frequently run queries that specify complex expressions in their predicates, you should consider creating a single-table join index that specifies a matching complex expression in its select list or column list, respectively. When Vantage creates the index, it transforms the complex expression into a simple index column on which you can collect statistics.

If the complex expression specified by the index is a term that matches a predicate condition for a query made against the base table the index is defined on, statistics collected on the index expression can be mapped to the base table so the Optimizer can use them to make more accurate single-table cardinality estimates.

Materialized Join Index

The materialized logical join index rows are the following:

Fixed Part Repeated Part
CustKey Name Status Date Comment
100 Robert S 2004-10-01 big order
S 2004-10-05 credit
101 Ann P 2004-10-08 discount
102 Don S 2004-10-01 rush order
D 2004-10-03 delayed

Note that the information for the null customer is not included in this join index because it was defined using an inner join.

Join indexes are not limited to binary joins: like any other join, they can be defined on joins involving as many as 128 tables.