15.00 - Examples That Obey the General Covering Rules for Extra Tables in the Join Index Definition - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Examples That Obey the General Covering Rules for Extra Tables in the Join Index Definition

The following set of base tables, join indexes, queries, and EXPLAIN reports demonstrate how the referential integrity relationships among the underlying base tables in a join index definition influence whether the Optimizer selects the index for queries that reference fewer base tables than are referenced by the join index.

    CREATE SET TABLE t1, NO FALLBACK, NO BEFORE JOURNAL, 
                         NO AFTER JOURNAL (
      x1 INTEGER NOT NULL,
      a1 INTEGER NOT NULL,
      b1 INTEGER NOT NULL,
      c1 INTEGER NOT NULL,
      d1 INTEGER NOT NULL,
      e1 INTEGER NOT NULL,
      f1 INTEGER NOT NULL,
      g1 INTEGER NOT NULL,
      h1 INTEGER NOT NULL,
      i1 INTEGER NOT NULL,
      j1 INTEGER NOT NULL,
      k1 INTEGER NOT NULL,
    CONSTRAINT ri1 FOREIGN KEY (a1, b1, c1) REFERENCES t2 (a2,b2,c2), 
    CONSTRAINT ri2 FOREIGN KEY (d1) REFERENCES t3(d3),
    CONSTRAINT ri3 FOREIGN KEY (e1,f1) REFERENCES t4 (e4,f4),
    CONSTRAINT ri4 FOREIGN KEY (g1,h1,i1,j1) REFERENCES t5(g5,h5,i5,j5),
    CONSTRAINT ri5 FOREIGN KEY (k1) REFERENCES t6(k6));
 
     CREATE SET TABLE t2, NO FALLBACK, NO BEFORE JOURNAL,
                          NO AFTER JOURNAL (
       a2 INTEGER NOT NULL,
       b2 INTEGER NOT NULL,
       c2 INTEGER NOT NULL,
       x2 INTEGER)
     UNIQUE PRIMARY INDEX(a2, b2, c2);
 
     CREATE SET TABLE t3, NO FALLBACK, NO BEFORE JOURNAL,
                          NO AFTER JOURNAL (
       d3 INTEGER NOT NULL,
       x3 INTEGER)
     UNIQUE PRIMARY INDEX(d3);
 
     CREATE SET TABLE t4, NO FALLBACK, NO BEFORE JOURNAL,
                          NO AFTER JOURNAL (
       e4 INTEGER NOT NULL,
       f4 INTEGER NOT NULL,
       x4 INTEGER)
     UNIQUE PRIMARY INDEX(e4, f4);
 
     CREATE SET TABLE t5, NO FALLBACK, NO BEFORE JOURNAL,
                          NO AFTER JOURNAL (
       g5 INTEGER NOT NULL,
       h5 INTEGER NOT NULL,
       i5 INTEGER NOT NULL,
       j5 INTEGER NOT NULL,
       x5 INTEGER)
     UNIQUE PRIMARY INDEX(g5, h5, i5, j5);
 
     CREATE SET TABLE t6, NO FALLBACK, NO BEFORE JOURNAL,
                          NO AFTER JOURNAL (
       k6 INTEGER not null,
       x6 INTEGER)
     UNIQUE PRIMARY INDEX(k6);

Example : All Outer Joins in Join Index Definition

The following join index definition left outer joins t1 to t3 on da=d3 and then left outer joins that result to t6 on k1=k6.

     CREATE JOIN INDEX jiout AS
      SELECT d1, d3, k1, k6, x1
      FROM t1 LEFT OUTER JOIN t3 ON d1=d3 
              LEFT OUTER JOIN t6 ON k1=k6;

You would expect the Optimizer to use jiout with the following query, because all the outer joins in the join index are inner joined to the query tables on unique columns (d1 and k1 are declared foreign keys in t1 and d3 and k6, the primary keys for t3 and t6, respectively, are declared as the unique primary index for those tables).

The bold EXPLAIN report text indicates that the Optimizer does use jiout in its query plan.

     EXPLAIN SELECT d1, SUM(x1)
             FROM t1, t3
             WHERE d1=d3
             GROUP BY 1;
 
     *** Help information returned. 17 rows. 
     *** Total elapsed time was 1 second.
 
Explanation
------------------------------------------------------------------------
 1) First, we lock a distinct HONG_JI."pseudo table" for read on a
    RowHash to prevent global deadlock for HONG_JI.jiout.
 2) Next, we lock HONG_JI.jiout for read.
 3) We do an all-AMPs SUM step to aggregate from HONG_JI.jiout by way
    of an all-rows scan with no residual conditions, and the grouping
    identifier in field 1.  Aggregate Intermediate Results are
    computed locally, then placed in Spool 3.  The size of Spool 3 is
    estimated with low confidence to be 1 row.  The estimated time for
    this step is 0.03 seconds.
 4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
    an all-rows scan into Spool 1 (group_amps), which is built locally
    on the AMPs.  The size of Spool 1 is estimated with low confidence
    to be 1 row.  The estimated time for this step is 0.04 seconds.
5)  Finally, we send out an END TRANSACTION step to all AMPs involved
    in processing the request.
   -> The contents of Spool 1 are sent back to the user as the result of
      statement 1.

Example : All Inner Joins Without Aggregation in Join Index Definition

The following simple join index definition specifies inner joins on tables t1, t3 and t6.

     CREATE JOIN INDEX jiin AS
      SELECT d1, d3, k1, k6, x1
      FROM t1, t3, t6
      WHERE d1=d3 
      AND   k1=k6;

You would expect the Optimizer to use jiin with the following query, and the bold EXPLAIN report text indicates that it does.

     EXPLAIN SELECT d1, SUM(x1)
             FROM t1, t3
             WHERE d1=d3
             GROUP BY 1;
 
     *** Help information returned. 17 rows. 
     *** Total elapsed time was 1 second.
   
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for read on a
        RowHash to prevent global deadlock for HONG_JI.jiin.
     2) Next, we lock HONG_JI.jiin for read.
     3) We do an all-AMPs SUM step to aggregate from HONG_JI.jiin by way
        of an all-rows scan with no residual conditions, and the grouping
        identifier in field 1.  Aggregate Intermediate Results are
        computed locally, then placed in Spool 3.  The size of Spool 3 is
        estimated with low confidence to be 1 row.  The estimated time for
        this step is 0.03 seconds.
     4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
        an all-rows scan into Spool 1 (group_amps), which is built locally
        on the AMPs.  The size of Spool 1 is estimated with low confidence
        to be 1 row.  The estimated time for this step is 0.04 seconds.
     5) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.

Example : All Inner Joins With Aggregation in Join Index Definition

The following aggregate join index definition specifies inner joins on tables t1, t2, and t4.

     CREATE JOIN INDEX jiin_aggr AS
      SELECT a1, e1, SUM(x1) AS total
      FROM t1, t2, t4
      WHERE a1=a2 
      AND   e1=e4 
      AND   a1>1
      GROUP BY 1, 2;

You would expect the Optimizer to use join index jiin_aggr in its plan for the following query because it has the same join term as the query.

The bold EXPLAIN report text indicates that the Optimizer does use jiin_aggr in its plan:

     EXPLAIN SELECT a1, e1, SUM(x1) AS total
             FROM t1, t2, t4
             WHERE a1=a2 
             AND e1=e4 
             AND a1>2
             GROUP BY 1, 2;
 
     *** Help information returned. 13 rows. 
     *** Total elapsed time was 1 second.
 
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for read on a
        RowHash to prevent global deadlock for HONG_JI.jiin_aggr.
     2) Next, we lock HONG_JI.jiin_aggr for read.
     3) We do an all-AMPs RETRIEVE step from HONG_JI.jiin_aggr by way of
        an all-rows scan with a condition of ("(HONG_JI.jiin_aggr.a1 > 2)
        AND (HONG_JI.jiin_aggr.a1 >= 3)") into Spool 1 (group_amps), which
        is built locally on the AMPs.  The size of Spool 1 is estimated
        with no confidence to be 1 row.  The estimated time for this step
        is 0.03 seconds.
     4) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.  The total estimated time is 0.03 seconds.

Example : All Inner Joins With Aggregation in Join Index Definition

You would not expect the Optimizer to use join index jiin_aggr (see “Example 3: All Inner Joins With Aggregation in Join Index Definition” on page 583) in its plan for the following query because the condition b1=b2 AND f1=f4 is not covered by the join index defined by jiin_aggr. As a result, the Optimizer specifies a full‑table scan to retrieve the specified rows.

The EXPLAIN report text indicates that the Optimizer does not choose jiin_aggr to cover the query:

     EXPLAIN SELECT a1, e1, SUM(x1) AS total
             FROM t1, t2, t4
             WHERE b1=b2 
             AND   f1=f4 
             AND   a1>2
             GROUP BY 1, 2;
 
   Explanation
   ---------------------------------------------------------------------
    1) First, we lock a distinct HONG_JI."pseudo table" for read on a
       RowHash to prevent global deadlock for HONG_JI.t4.
    2) Next, we lock a distinct HONG_JI."pseudo table" for read on a
       RowHash to prevent global deadlock for HONG_JI.t2.
    3) We lock a distinct HONG_JI."pseudo table" for read on a RowHash to
       prevent global deadlock for HONG_JI.t1.
    4) We lock HONG_JI.t4 for read, we lock HONG_JI.t2 for read, and we
       lock HONG_JI.t1 for read.
    5) We do an all-AMPs RETRIEVE step from HONG_JI.t1 by way of an
       all-rows scan with a condition of ("HONG_JI.t1.a1 > 2") into Spool
       4 (all_amps), which is duplicated on all AMPs.  The size of Spool
       4 is estimated with no confidence to be 2 rows.  The estimated
       time for this step is 0.03 seconds.
    6) We do an all-AMPs JOIN step from HONG_JI.t2 by way of an all-rows
       scan with no residual conditions, which is joined to Spool 4 (Last
       Use).  HONG_JI.t2 and Spool 4 are joined using a product join,
       with a join condition of ("b1 = HONG_JI.t2.b2").  The result goes
       into Spool 5 (all_amps), which is duplicated on all AMPs.  The
       size of Spool 5 is estimated with no confidence to be 3 rows.  The
       estimated time for this step is 0.04 seconds.
    7) We do an all-AMPs JOIN step from HONG_JI.t4 by way of an all-rows
       scan with no residual conditions, which is joined to Spool 5 (Last
       Use).  HONG_JI.t4 and Spool 5 are joined using a product join,
       with a join condition of ("f1 = HONG_JI.t4.f4").  The result goes
       into Spool 3 (all_amps), which is built locally on the AMPs.  The
       size of Spool 3 is estimated with no confidence to be 2 rows.  The
       estimated time for this step is 0.04 seconds.
    8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
       way of an all-rows scan, and the grouping identifier in field 1.
       Aggregate Intermediate Results are computed globally, then placed
       in Spool 6.  The size of Spool 6 is estimated with no confidence
       to be 2 rows.  The estimated time for this step is 0.05 seconds.
    9) We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of
       an all-rows scan into Spool 1 (group_amps), which is built locally
       on the AMPs.  The size of Spool 1 is estimated with no confidence
       to be 2 rows.  The estimated time for this step is 0.04 seconds.
   10) Finally, we send out an END TRANSACTION step to all AMPs involved
       in processing the request.
    -> The contents of Spool 1 are sent back to the user as the result of
       statement 1.

Example : More Inner Joined Tables in Aggregate Join Index Definition Than in Query

The following aggregate join index definition specifies inner joins on tables t1, t3, and t6 using conditions that exploit a foreign key-primary key relationship between table t1 and tables t3 and t6, respectively.

     CREATE JOIN INDEX jiin_aggr AS
      SELECT d1, k1, SUM(x1) AS total
      FROM t1, t3, t6
      WHERE d1=d3 
      AND   k1=k6
      GROUP BY 1, 2;

You would expect the Optimizer to include join index jiin_aggr in its access plan for the following query even though jiin_aggr is defined with an inner joined table, t6, that the query does not reference. This is acceptable to the Optimizer because of the foreign key-primary key relationship between t1 and the extra table, t6, on columns k1 and k6,which have a foreign key‑primary key relationship and are explicitly defined as a foreign key and as the unique primary index for their respective tables.

The bold EXPLAIN report text indicates that the Optimizer does select jiin_aggr for the query plan:

     EXPLAIN SELECT d1, SUM(x1)
             FROM t1, t3
             WHERE d1=d3
             GROUP BY 1;
 
    *** Help information returned. 17 rows. 
    *** Total elapsed time was 1 second.
   
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for read on a
        RowHash to prevent global deadlock for HONG_JI.jiin_aggr.
     2) Next, we lock HONG_JI.jiin_aggr for read.
     3) We do an all-AMPs SUM step to aggregate from HONG_JI.jiin_aggr by
        way of an all-rows scan with no residual conditions, and the
        grouping identifier in field 1.  Aggregate Intermediate Results
        are computed locally, then placed in Spool 3.  The size of Spool 3
        is estimated with low confidence to be 1 row.  The estimated time
        for this step is 0.03 seconds.
     4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
        an all-rows scan into Spool 1 (group_amps), which is built locally
        on the AMPs.  The size of Spool 1 is estimated with low confidence
        to be 1 row.  The estimated time for this step is 0.04 seconds.
     5) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.

Example : Join Index Left Outer Joined on Six Tables

The following join index definition left outer joins table t1 with, in succession, tables t2, t3, t4, t5, and t6 on a series of equality conditions made on foreign key‑primary key relationships among the underlying base tables.

     CREATE JOIN INDEX jiout AS
      SELECT a1, b1, c1, c2, d1, d3, e1, e4, f1, g1, h1, i1, j1, j5, 
             k1, k6, x1 
      FROM t1 
      LEFT OUTER JOIN t2 ON a1=a2 AND b1=b2 AND c1=c2
      LEFT OUTER JOIN t3 ON d1=d3 
      LEFT OUTER JOIN t4 ON e1=e4 AND f1=f4 
      LEFT OUTER JOIN t5 ON g1=g5 AND h1=h5 AND i1=i5 AND j1=j5 
      LEFT OUTER JOIN t6 ON k1=k6;

Even though the following query references fewer tables than are defined in the join index, you would expect the Optimizer to include join index ji_out in its access plan because all the extra outer joins are defined on unique columns and the extra tables are the inner tables in the outer joins.

The bold EXPLAIN report text indicates that the Optimizer does select ji_out for the query plan:

     EXPLAIN SELECT a1, b1, c1, SUM(x1)
             FROM t1, t2
             WHERE a1=a2 
             AND   b1=b2 
             AND   c1=c2
             GROUP BY 1, 2, 3;
 
    *** Help information returned. 18 rows. 
    *** Total elapsed time was 1 second.
   
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for read on a
        RowHash to prevent global deadlock for HONG_JI.jiout.
     2) Next, we lock HONG_JI.jiout for read.
     3) We do an all-AMPs SUM step to aggregate from HONG_JI.jiout by way
        of an all-rows scan with no residual conditions, and the grouping
        identifier in field 1.  Aggregate Intermediate Results are
        computed locally, then placed in Spool 3.  The size of Spool 3 is
        estimated with high confidence to be 2 rows.  The estimated time
        for this step is 0.03 seconds.
     4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
        an all-rows scan into Spool 1 (group_amps), which is built locally
        on the AMPs.  The size of Spool 1 is estimated with high
        confidence to be 2 rows.  The estimated time for this step is 0.04
        seconds.
     5) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.

Example : Many More Tables Referenced by Join Index Definition Than Referenced by Query

The following join index definition specifies all inner joins on tables t1, t2, t3, t4, t5 and t6 and specifies equality conditions on all the foreign key‑primary key relationships among those tables.

     CREATE JOIN INDEX ji_in AS
      SELECT a1, b1, c1, c2, d1, d3, e1, e4, f1, g1, g5, h1, i1, j1, 
             k1, k6, x1 
      FROM t1, t2, t3, t4, t5, t6
      WHERE a1=a2 
      AND   b1=b2 
      AND   c1=c2 
      AND   d1=d3 
      AND   e1=e4 
      AND   f1=f4 
      AND   g1=g5 
      AND   h1=h5 
      AND   i1=i5 
      AND   j1=j5 
      AND   k1=k6;

Even though 6 tables are referenced in the join index definition, and all its join conditions are inner joins, you would expect the Optimizer to include join index ji_in in its query plan for the following query, which only references 2 of the 6 tables, because all the conditions in the join index definition are based on foreign key‑primary key relationships among the underlying base tables.

The bold EXPLAIN report text indicates that the Optimizer does select ji_in for the query plan:

     EXPLAIN SELECT a1, b1, c1, SUM(x1)
             FROM t1, t2
             WHERE a1=a2 
             AND   b1=b2 
             AND   c1=c2
             GROUP BY 1, 2, 3;
 
    *** Help information returned. 18 rows. 
    *** Total elapsed time was 1 second.
   
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for read on a
        RowHash to prevent global deadlock for HONG_JI.ji_in.
     2) Next, we lock HONG_JI.ji_in for read.
     3) We do an all-AMPs SUM step to aggregate from HONG_JI.ji_in by way
        of an all-rows scan with no residual conditions, and the grouping
        identifier in field 1.  Aggregate Intermediate Results are
        computed locally, then placed in Spool 3.  The size of Spool 3 is
        estimated with high confidence to be 2 rows.  The estimated time
        for this step is 0.03 seconds.
     4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
        an all-rows scan into Spool 1 (group_amps), which is built locally
        on the AMPs.  The size of Spool 1 is estimated with high
        confidence to be 2 rows.  The estimated time for this step is 0.04
        seconds.
     5) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.

Example : Using a Join Index That Has an Extra Inner Join In Its Definition

The following example illustrates how the Optimizer uses a join index with an extra inner join when the connections among the tables in its definition are appropriately defined.

Suppose you have the following table definitions:

     CREATE SET TABLE fact (
       f_d1 INTEGER NOT NULL,
       f_d2 INTEGER NOT NULL, 
     FOREIGN KEY (f_d1) REFERENCES WITH NO CHECK OPTION dim1 (d1),
     FOREIGN KEY (f_d2) REFERENCES WITH NO CHECK OPTION dim2 (d2))
     UNIQUE PRIMARY INDEX (f_d1,f_d2);
 
     CREATE SET TABLE dim1 (
       a1 INTEGER NOT NULL,
       d1 INTEGER NOT NULL, 
     FOREIGN KEY (a1) REFERENCES WITH NO CHECK OPTION dim1_1 (d11))
     UNIQUE PRIMARY INDEX (d1);
 
     CREATE SET TABLE dim2 (
       d1 INTEGER NOT NULL,
       d2 INTEGER NOT NULL)
     UNIQUE PRIMARY INDEX (d2);
 
     CREATE SET TABLE dim1_1 (
       d11 INTEGER NOT NULL,
       d22 INTEGER NOT NULL)
     UNIQUE PRIMARY INDEX (d11);

The following graphic sketches the dimensional relationships among these tables:

In the following join index definition, the fact table is joined with dimension tables dim1 and dim2 by foreign key‑primary key joins on fact.f_d1=dim1.d1 and fact.f_d2=dim2.d2. Dimension table dim1 is also joined with its dimension subtable dim1_1 by a foreign key‑primary key join on dim1.a1=dim1_1.d11. A query on the fact and dim2 tables uses the join index ji_all because of its use of foreign key‑primary key relationships among the tables:

     CREATE JOIN INDEX ji_all AS 
      SELECT (COUNT(*)(FLOAT)) AS countstar, dim1.d1, dim1_1.d11,
              dim2.d2, (SUM(fact.f_d1)(FLOAT)) AS sum_f1
      FROM fact, dim1, dim2, dim1_1 
      WHERE ((fact.f_d1 = dim1.d1 ) 
      AND   (fact.f_d2 = dim2.d2 )) 
      AND   (dim1.a1 = dim1_1.d11 )
      GROUP BY dim1.d1, dim1_1.d11, dim2.d2 
      PRIMARY INDEX (d1);

Note that the results of the aggregate operations COUNT and SUM are both typed as FLOAT (see “Restrictions on Join Index Aggregate Functions” on page 591).

As the bold text in the following EXPLAIN report indicates, the Optimizer uses the join index ji_all for its query plan in this situation because even though table fact is the parent table of both tables dim1 and dim2, those tables are joined with fact on foreign key columns in the join index definition. Similarly, dim1 is joined to dim1_1 in the join index definition on a foreign key column.

     EXPLAIN SELECT d2, SUM(f_d1) 
             FROM fact, dim2 
             WHERE f_d2=d2 
             GROUP BY 1;
 
    *** Help information returned. 18 rows. 
    *** Total elapsed time was 1 second.
   
   Explanation
   ---------------------------------------------------------------------------
     1) First, we lock a distinct HONG_JI."pseudo table" for
        read on a RowHash to prevent global deadlock for
        HONG_JI.ji_all.
     2) Next, we lock HONG_JI.ji_all for read.
     3) We do an all-AMPs SUM step to aggregate from
        HONG_JI.ji_all by way of an all-rows scan with no
        residual conditions, and the grouping identifier in field 1.
        Aggregate Intermediate Results are computed globally, then placed
        in Spool 3.  The size of Spool 3 is estimated with no confidence
        to be 3 rows.  The estimated time for this step is 0.08 seconds.
     4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
        an all-rows scan into Spool 1 (group_amps), which is built locally
        on the AMPs.  The size of Spool 1 is estimated with no confidence
        to be 3 rows.  The estimated time for this step is 0.07 seconds.
     5) Finally, we send out an END TRANSACTION step to all AMPs involved
        in processing the request.
     -> The contents of Spool 1 are sent back to the user as the result of
        statement 1.