17.10 - Example: Creating and Using a Simple Row-Compressed Join Index - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Data Definition Language Syntax and Examples

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B035-1144-171K
Language
English (United States)

This example creates a simple row-compressed join index based on the following customer and orders tables.

     CREATE TABLE customer (
       c_custkey    INTEGER,
       c_name       CHARACTER(26) not null,
       c_address    VARCHAR(41),
       c_nationkey  INTEGER,
       c_phone      CHARACTER(16),
       c_acctbal    DECIMAL(13,2),
       c_mktsegment CHARACTER(21),
       c_comment    VARCHAR(127))
     PRIMARY INDEX( c_custkey );
     CREATE TABLE orders (
       o_orderkey      INTEGER,
       o_date          DATE FORMAT 'yyyy-mm-dd',
       o_status        CHARACTER(1),       
       o_custkey       INTEGER,
       o_totalprice    DECIMAL(13,2),
       o_orderpriority CHARACTER(21),
       o_clerk         CHARACTER(16),
       o_shippriority  INTEGER,
       o_comment       VARCHAR(79))
     UNIQUE PRIMARY INDEX(o_orderkey);

Assume that you frequently submit this join query on customer and orders.

     SELECT o_custkey, c_name, o_status, o_date, o_comment
     FROM orders, customer 
     WHERE o_custkey = c_custkey;

Without a join index, a typical execution plan for this query would redistribute the orders table into a spool file, sort the spool on o_custkey, and then doing a merge join between the spool file and the customer table.

If a join index like ord_cust_idx were defined on customer and orders, the resulting execution plan could just scan the join index.

The fixed column set in the join index definition is o_custkey and c_name (corresponding results rows highlighted in orange) , and the repeating column set in the join index definition is o_status, o_date, and o_comment, (corresponding results rows highlighted in blue).

     CREATE JOIN INDEX ord_cust_idx AS
       SELECT (o_custkey,  c_name 
), (
 o_status, o_date, o_comment 
) 
       FROM orders, customer 
       WHERE o_custkey = c_custkey;

The way ord_cust_idx is defined makes it a row-compressed join index, with the fixed column set being o_custkey and c_name and the repeating column set being o_status, o_date, and o_comment.

Assume that the base tables customer and orders contain the following data.

customer    
c_custkey c_name c_address
100 Louis Rancho Bernardo
101 Pekka Helsinki
102 Felipe San Juan
orders        
o_orderkey o_date o_status o_custkey o_comment
5000 2011-10-01 S 102 rush order
5001 2011-10-01 S 100 big order
5002 2011-10-03 D 102 delayed
5003 2011-10-05 U ? unknown customer
5004 2011-10-05 S 100 credit
5005 2011-10-08 P 101 discount

The corresponding rows in the row-compressed join index ord_cust_idx are the following.

ord_cust_idx  
fixed columns repeated columns
100  Louis S  2011-10-01  big order  2011-10-05  credit
101  Pekka P  2011-10-08  discount
102  Felipe S  2011-10-03  rush order  D  2011-10-03  delayed

Notice that the fixed set columns are contained in the first column of ord_cust_idx and the repeated set columns are contained in the second column of ord_cust_idx , just as the join index definition had defined.

A drawback of join results generated from inner joins is that unmatched rows are not preserved. In the first part of this example, information about order 5003 is not stored in ord_cust_idx , thus limiting the types of queries that could be resolved using the index. To make join indexes applicable to as many queries as possible, you should define your join indexes using outer joins whenever possible. This enables a join index to satisfy queries with fewer join conditions than those used to generate the index.

Suppose you changed the definition of ord_cust_idx to use an outer join as the following CREATE JOIN INDEX request does.

     CREATE JOIN INDEX ord_cust_idx AS
     SELECT (o_custkey, c_name), (o_status, o_date, o_comment)
     FROM orders LEFT OUTER JOIN customer ON o_custkey = c_custkey;

The resulting rows in ord_cust_idx are the following:

ord_cust_idx  
fixed columns repeated columns
100 Louis S  2011-10-01  big order  S  2011-10-05  credit
101 Pekka P  2011-10-08  discount
102 Felipe S  2011-10-03  rush order  D  2011-10-03  delayed
 ?   ? U  2011-10-05  unknown customer

By redefining ord_cust_idx using a left outer join, the following query could also be resolved using only the join index.

     SELECT o_status, o_date, o_comment
     FROM orders;