Join Index Examples | Join Index Optimizations | VantageCloud Lake - Join Indexes - Teradata VantageCloud Lake

Lake - Working with SQL

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
jbe1714339405530.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
jbe1714339405530

The following sections demonstrate the performance optimization achieved on table selects, deletes, inserts, and updates resulting from the use of join indexes.

For additional information about join indexes, see When Join Indexes Are Useful and CREATE JOIN INDEX.

A Simple Join Query

The following is an example of a simple join query:

     EXPLAIN
     SELECT o_orderdate, o_custkey, l_partkey, l_quantity,
            1_extendedprice
     FROM lineitem, ordertbl
     WHERE l_orderkey=o_orderkey;

Part of the EXPLAIN output follows.

  4) We do an all-AMPs JOIN step in TD_MAP1 from DB1.ordertbl by way
     of a RowHash match scan with no residual conditions, which is
     joined to DB1.lineitem by way of a RowHash match scan with no
     residual conditions.  DB1.ordertbl and DB1.lineitem are joined
     using a sliding-window merge join, with a join condition of (
     "DB1.lineitem.l_orderkey = DB1.ordertbl.o_orderkey").  The
     result goes into Spool 1 (group_amps), which is built locally on
     the AMPs.  The size of Spool 1 is estimated with low confidence to
     be 11 rows (869 bytes).  The estimated time for this step is 0.21
     seconds.

A Search on a Join Index

The following is an example of a search condition on the join index:

     EXPLAIN
     SELECT o_orderdate, o_custkey, l_partkey, l_quantity,
            l_extendedprice
     FROM lineitem, ordertbl
     WHERE l_orderkey=o_orderkey
     AND   o_orderdate>'1997-11-01';

Part of the EXPLAIN output follows.

  4) We do an all-AMPs JOIN step in TD_MAP1 from 500 partitions of
     DB1.ordertbl by way of a RowHash match scan with a condition of (
     "DB1.ordertbl.o_orderdate > DATE '1997-11-01'"), which is joined
     to DB1.lineitem by way of a RowHash match scan with no residual
     conditions.  DB1.ordertbl and DB1.lineitem are joined using a
     sliding-window merge join, with a join condition of (
     "DB1.lineitem.l_orderkey = DB1.ordertbl.o_orderkey").  The
     result goes into Spool 1 (group_amps), which is built locally on
     the AMPs.  The size of Spool 1 is estimated with low confidence to
     be 2 rows (158 bytes).  The estimated time for this step is 0.21
     seconds.

Aggregation on a Join Index

The following is an example of an aggregation on a join index.

     EXPLAIN
     SELECT l_partkey, AVG(l_quantity), AVG(l_extendedprice)
     FROM lineitem, ordertbl
     WHERE l_orderkey=o_orderkey
     AND   o_orderdate >'1997-11-01'
     GROUP BY l_partkey;

Part of the EXPLAIN output follows.

  4) We do an all-AMPs JOIN step in TD_MAP1 from DB1.ordertbl by way
     of a RowHash match scan with no residual conditions, which is
     joined to DB1.lineitem by way of a RowHash match scan with no
     residual conditions.  DB1.ordertbl and DB1.lineitem are joined
     using a sliding-window merge join, with a join condition of (
     "DB1.lineitem.l_orderkey = DB1.ordertbl.o_orderkey").  The
     result goes into Spool 2 (all_amps), which is built locally on the
     AMPs.  The size of Spool 2 is estimated with low confidence to be
     11 rows (341 bytes).  The estimated time for this step is 0.21
     seconds.
  5) We do an all-AMPs SUM step in TD_Map1 to aggregate from Spool 2
     (Last Use) by way of an all-rows scan, grouping by field1 (
     DB1.lineitem.l_partkey).  Aggregate Intermediate Results are
     computed globally, then placed in Spool 4 in TD_Map1.  The size of
     Spool 4 is estimated with no confidence to be 9 rows (369 bytes).
     The estimated time for this step is 0.23 seconds.
  6) We do an all-AMPs RETRIEVE step in TD_Map1 from Spool 4 (Last Use)
     by way of an all-rows scan into Spool 1 (group_amps), which is
     built locally on the AMPs.  The size of Spool 1 is estimated with
     no confidence to be 9 rows (522 bytes).  The estimated time for
     this step is 0.16 seconds.

Join Index Used to Join with Another Base Table

The following is an example of a join index used to join with another base table:

     EXPLAIN
     SELECT o_orderdate, c_name, c_phone, l_partkey,l_quantity,
            l_extendedprice
     FROM lineitem, ordertbl, customer
     WHERE l_orderkey=o_orderkey
     AND   o_custkey=c_custkey;

Part of the EXPLAIN output follows.

  5) We do an all-AMPs JOIN step in TD_MAP1 from DB1.ordertbl by way
     of a RowHash match scan with no residual conditions, which is
     joined to DB1.lineitem by way of a RowHash match scan with no
     residual conditions.  DB1.ordertbl and DB1.lineitem are joined
     using a sliding-window merge join, with a join condition of (
     "DB1.lineitem.l_orderkey = DB1.ordertbl.o_orderkey").  The
     result goes into Spool 2 (all_amps), which is redistributed by the
     hash code of (DB1.ordertbl.o_custkey) to all AMPs in TD_Map1.
     Then we do a SORT to order Spool 2 by row hash.  The size of Spool
     2 is estimated with low confidence to be 11 rows (407 bytes).  The
     estimated time for this step is 0.11 seconds.
  6) We do an all-AMPs JOIN step in TD_MAP1 from DB1.customer by way
     of a RowHash match scan with a condition of (
     "(DB1.customer.c_custkey <= 49999) AND (DB1.customer.c_custkey
     >= 0)"), which is joined to Spool 2 (Last Use) by way of a RowHash
     match scan.  DB1.customer and Spool 2 are joined using a merge
     join, with a join condition of ("o_custkey =
     DB1.customer.c_custkey").  The result goes into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with low confidence to be 11 rows (1,188
     bytes).  The estimated time for this step is 0.21 seconds.

Join Index Used to Resolve Single-Table Query

The following is an example of a join index used to resolve single table query:

     EXPLAIN
     SELECT l_orderkey, l_partkey, l_quantity, l_extendedprice
     FROM lineitem
     WHERE l_partkey = 1001;

Part of the EXPLAIN output follows.

  3) We do an all-AMPs RETRIEVE step in TD_MAP1 from DB1.ORDERJOINLINE
     by way of an all-rows scan with a condition of (
     "DB1.ORDERJOINLINE.l_partkey = 1001") into Spool 1 (group_amps),
     which is built locally on the AMPs.  The size of Spool 1 is
     estimated with no confidence to be 1 row (69 bytes).  The
     estimated time for this step is 0.15 seconds.

Creating and Using a Secondary Index on a Join Index

The following is an example creating and using secondary index on a join index:

     CREATE INDEX shipidx(l_shipdate) ON OrderJoinLine;
     ***Index has been created.
     ***Total elapsed time was 5 seconds.
     EXPLAIN
     SELECT o_orderdate, o_custkey, l_partkey, l_quantity,  
            l_extendedprice
     FROM lineitem, ordertbl
     WHERE l_orderkey=o_orderkey
     AND   l_shipdate='1997-09-18';

Part of the EXPLAIN output follows.

  3) We do an all-AMPs RETRIEVE step in TD_MAP1 from DB1.ORDERJOINLINE
     by way of an all-rows scan with a condition of ("(NOT
     (DB1.ORDERJOINLINE.o_orderdate IS NULL )) AND
     (DB1.ORDERJOINLINE.l_shipdate = DATE '1997-09-18')") into Spool 1
     (group_amps), which is built locally on the AMPs.  The size of
     Spool 1 is estimated with no confidence to be 8 rows (632 bytes).
     The estimated time for this step is 0.15 seconds.

Join Index Left Outer Joined on 6 Tables

The following join index definition left outer joins table t1 with, in succession, tables t2, t3, t4, t5, and t6 on a series of equality conditions made on foreign key-primary key relationships among the underlying base tables:

     CREATE JOIN INDEX jiout AS
     SELECT a1,b1,c1,c2,d1,d3,e1,e4,f1,g1,h1,i1,j1,j5,k1,k6, x1
     FROM t1
     LEFT OUTER JOIN t2 ON  a1=a2
                        AND b1=b2
                        AND c1=c2
     LEFT OUTER JOIN t3 ON  d1=d3
     LEFT OUTER JOIN t4 ON  e1=e4
                        AND f1=f4
     LEFT OUTER JOIN t5 ON  g1=g5
                        AND h1=h5
                        AND i1=i5
                        AND j1=j5
     LEFT OUTER JOIN t6 ON  k1=k6;

Although the following query references fewer tables than the join index defines, you expect the Optimizer to include join index ji_out in its access plan, because the extra outer joins are defined on unique columns and the extra tables are the inner tables in the outer joins.

A covering join index whose definition includes one or more tables that is not specified in the covered query is called a broad join index. A wide range of queries can use a broad join index, especially when there are foreign key-primary key relationships defined between the fact table and the dimension tables that enable the index to be used to cover queries over a subset of dimension tables.

The bold EXPLAIN report text indicates that the Optimizer does select ji_out for the query plan. This is an example of a broad join index being used to cover a query.

     EXPLAIN SELECT a1, b1, c1, SUM(x1)
     FROM t1,t2
     WHERE a1=a2
     AND   b1=b2
     AND   c1=c2
     GROUP BY 1, 2, 3;

Part of the EXPLAIN output follows.

      3) We do an all-AMPs SUM step to aggregate from  HONG_JI.jiout  
         by way of an all-rows scan with no residual conditions, and 
         the grouping identifier in field 1. Aggregate Intermediate 
         Results are computed locally, then placed in Spool 3. The 
         size of Spool 3 is estimated with high confidence to be 2
         rows. The estimated time for this step is 0.03 seconds.
      4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by 
         way of an all-rows scan into Spool 1 (group_amps), which is 
         built locally on the AMPs. The size of Spool 1 is estimated 
         with high confidence to be 2 rows. The estimated time for 
         this step is 0.04 seconds.

More Tables Referenced by a Join Index Definition than Referenced by Query

The following join index definition specifies all inner joins on tables t1, t2, t3, t4, t5and t6 and specifies equality conditions on all the foreign key-primary key relationships among those tables:

     CREATE JOIN INDEX ji_in AS
     SELECT a1,b1,c1,c2,d1,d3,e1,e4,f1,g1,g5,h1,i1,j1,k1,k6, x1
     FROM t1,t2,t3,t4,t5,t6
     WHERE a1=a2
     AND   b1=b2
     AND   c1=c2
     AND   d1=d3
     AND   e1=e4
     AND   f1=f4
     AND   g1=g5
     AND   h1=h5
     AND   i1=i5
     AND   j1=j5
     AND   k1=k6;

Even though 6 tables are referenced in the join index definition, and all its join conditions are inner joins, you expect the Optimizer to include join index ji_inin its query plan for the following query, which only references 2 of the 6 tables, because all the conditions in the join index definition are based on foreign key-primary key relationships among the underlying base tables. This is also an example of a broad join index being used to cover a query.

The bold EXPLAIN report text indicates that the Optimizer does select ji_infor the query plan.

     EXPLAIN SELECT a1,b1,c1,SUM(x1)
     FROM t1,t2
     WHERE a1=a2
     AND   b1=b2
     AND   c1=c2
     GROUP BY 1, 2, 3;

Part of the EXPLAIN output follows.

      3) We do an all-AMPs SUM step to aggregate from  HONG_JI.ji_in  
         by way of an all-rows scan with no residual conditions, 
         and the grouping identifier in field 1. Aggregate Intermediate
         Results are computed locally, then placed in Spool 3. The size
         of Spool 3 is estimated with high confidence to be 2 rows.
         The estimated time for this step is 0.03 seconds.
      4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by 
         way of an all-rows scan into Spool 1 (group_amps), which
         is built locally on the AMPs. The size of Spool 1 is estimated
         with high confidence to be 2 rows. The estimated time for
         this step is 0.04 seconds.

Using a Join Index Defined with a Multiway Join Result

The following is an example defining and using a join index defined with a multiway join result:

     CREATE JOIN INDEX CustOrderJoinLine
     AS SELECT (1_orderkey,o_orderdate,c_nationkey,o_totalprice),
               (1_partkey,1_quantity,1_extendedprice,1_shipdate)
     FROM (lineitem
     LEFT OUTER JOIN ordertbl ON 1_orderkey=o_orderkey)
     INNER JOIN customer ON o_custkey=c_custkey
     PRIMARY INDEX (1_orderkey);
     *** Index has been created.
     *** Total elapsed time was 20 seconds.
     EXPLAIN
     SELECT (1_orderkey,o_orderdate,c_nationkey,o_totalprice),
            (1_partkey,1_quantity,1_extendedprice,1_shipdate)
     FROM lineitem,ordertbl,customer
     WHERE 1_orderkey=o_custkey
     AND o_custkey=c_custkey
     AND c_nationkey=10;

Part of the EXPLAIN output follows.

   3) We do an all-AMPs RETRIEVE step from join index table
      df2.CustOrderJoinLine by way of an all-rows scan with a
      condition of (“df2.CustOrderJoinLine.c_nationkey=10”) into
      Spool 1, which is built locally on the AMPs. The input table is
      not cached in memory, but it is eligible for synchronized
      scanning. The result spool file is not cached in memory.
      The size of Spool 1 is estimated to be 200 rows. The estimated 
      time for this step is 3 minutes and 57 seconds.