16.10 - Inclusion and Exclusion Product Joins With Dynamic Row Partition Elimination - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
16.10
created_date
June 2017
category
Programming Reference
User Guide
featnum
B035-1142-161K

About Product Joins With Dynamic Row Partition Elimination

When the Optimizer determines that the least costly method for making a join between a row-partitioned table and another relation on either an IN or NOT IN condition, it can choose to use either an inclusion or exclusion product join with DPE. Whether it uses an inclusion or exclusion product join with DPE depends on the condition on which the relations are joined.

IF the join condition is on this term … THEN the Optimizer chooses this product join …
IN Inclusive with DPE.
NOT IN Exclusive with DPE.

The performance enhancement seen for inclusion and exclusion DPE product joins over standard inclusion and exclusion product joins occurs for two reasons.

  • The number of comparisons between left relation and right relation rows is reduced, which in turn reduces the I/O on the left and right relations and the CPU costs to perform the join.
  • The following is true for the case when one relation in a join is a table and the other is a spool:

    When the number of values in the spool is small, resulting in fewer populated spool partitions than populated row partitions from the row-partitioned table, those table partitions that are not in spool need not be either read or joined, reducing both the I/O and CPU costs of the join.

Because the performance improvement gained from DPE joins is highly dependent on the number of rows in each row partition (and so the number of row comparisons that must be made), the Optimizer does not choose a DPE join unless single-column PARTITION statistics have been collected on the outer (row-partitioned) table in the join. PARTITION statistics improve the ability of the Optimizer to estimate the number of comparisons that must be made. The Optimizer does not choose a DPE join when there is a no confidence estimate on the cardinality of the subquery spool.

For example, suppose you create the following tables and submit the given SELECT request against those tables:

     CREATE SET TABLE MWS.t1, FALLBACK, NO BEFORE JOURNAL,
     NO AFTER JOURNAL, CHECKSUM = DEFAULT (
       a INTEGER,
       b INTEGER)
     PRIMARY INDEX (a);

     CREATE SET TABLE MWS.t2, FALLBACK, NO BEFORE JOURNAL,
     NO AFTER JOURNAL, CHECKSUM = DEFAULT (
       a INTEGER,
       b INTEGER,
       c INTEGER,
       d INTEGER)
     PRIMARY INDEX (a)
     PARTITION BY (RANGE_N(b BETWEEN 1 /* Partitioning level one. */
                             AND   100
                             EACH    7,
                   NO RANGE OR UNKNOWN),
                   RANGE_N(c BETWEEN 1 /* Partitioning level two. */
                             AND   100
                             EACH   10,
                   NO RANGE OR UNKNOWN),
                   RANGE_N(d BETWEEN 1 /* Partitioning level three. */
                             AND   100
                             EACH   20,
                   NO RANGE OR UNKNOWN));
     SELECT *
     FROM t2
     WHERE (b,c) IN (SELECT a,b
                     FROM t1);

For this request to be eligible for an inclusion or exclusion product join with DPE, you must collect the following statistics:

     COLLECT STATISTICS t1 COLUMN(a);
     COLLECT STATISTICS t2 COLUMN(PARTITION);

The following rules apply to collecting these statistics:

To enable this type of join… You must collect statistics on the …
Product (all kinds) with DPE
  • join column set of the table that must be duplicated to enable an inclusion or exclusion product join with DPE.
  • any right table predicate columns that are applied to the table that must be duplicated.
Inclusion or exclusion product with DPE system-derived PARTITION column set of the row-partitioned table.

Another name for the type of join being done with inclusion and exclusion product joins is semijoin. As a result, the inclusion and exclusion product joins with DPE can also be referred to by the name product semijoin with DPE.

When a product semijoin with DPE is applied by the Optimizer, the EXPLAIN text for the query indicates enhanced by dynamic row partition elimination for the corresponding AMP step. In the following example, this text is highlighted in boldface type:

     EXPLAIN SELECT COUNT(*)
             FROM t55
             WHERE (b,c) NOT IN (SELECT 1, 1);
     Explanation
     ------------------------------------------------------------------
  1) First, we lock MWS.t55 for read on a reserved RowHash in all partitions
     to prevent global deadlock.
  2) Next, we lock MWS.t55 for read.
  3) We do an INSERT into Spool 5.
  4) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of
     an all-rows scan into Spool 4 (group_amps), which is redistributed
     by the hash code of (1, 1) to all AMPs. The size of Spool 4 is
     estimated with high confidence to be 1 row (22 bytes). The
     estimated time for this step is 0.03 seconds.
  5) We do a group-AMP RETRIEVE step from Spool 4 (Last Use) by way of
     an all-rows scan into Spool 7 (all_amps), which is duplicated on
     all AMPs. Then we do a SORT to partition by rowkey. The size of
     Spool 7 is estimated with high confidence to be 2 rows (30 bytes).
     The estimated time for this step is 0.03 seconds.
  6) We do an all-AMPs JOIN step from MWS.t55 by way of an all-rows
     scan with no residual conditions, which is joined to Spool 7 (Last
     Use) by way of an all-rows scan. MWS.t55 and Spool 7 are joined
     using an exclusion product join, with a join condition of (
     "(MWS.t55.b = Field_2) AND (MWS.t55.c = Field_3)")  enhanced by 
     dynamic row partition elimination. The input table MWS.t55
     will not be cached in memory, but it is eligible for synchronized
     scanning. The result goes into Spool 3 (all_amps), which is built
     locally on the AMPs. The result spool file will not be cached
     in memory. The size of Spool 3 is estimated with index join
     confidence to be 2,370,668 rows (35,560,020 bytes). The estimated
     time for this step is 39.38 seconds.
  7) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
     way of an all-rows scan. Aggregate Intermediate Results are
     computed globally, then placed in Spool 8. The size of Spool 8 is
     estimated with high confidence to be 1 row (23 bytes). The
     estimated time for this step is 3.06 seconds.
  8) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
     an all-rows scan into Spool 1 (group_amps), which is built locally
     on the AMPs. The size of Spool 1 is estimated with high
     confidence to be 1 row (25 bytes). The estimated time for this
     step is 0.02 seconds.
  9) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.

The Optimizer applies an inclusion or exclusion product join with DPE when there are equality join terms between the partitioning columns at one or more partitioning levels of a row-partitioned table, which is the outer table of a semijoin, and another relation. Inclusion and exclusion product joins with DPE enhance the performance of such queries for those cases where the row-partitioned table is large and a costly sort in preparation for a regular inclusion or exclusion merge join can be avoided. Such queries further benefit when the number of rows for the join column selected by the subquery is small because in that case, the number of values in the spool is small, resulting in fewer populated spool row partitions than populated table partitions of the row-partitioned table. As a result, the table row partitions that are not in spool neither need to be read nor joined, reducing both the I/O and CPU costs of the join.

For example, assume you have the following table definitions and cardinalities:

     CREATE SET TABLE t1 (
       a INTEGER,
       b INTEGER,
       c INTEGER)
     PRIMARY INDEX ( a );
     CREATE SET TABLE t8 (
       a INTEGER,
       b INTEGER,
       c INTEGER)
     PRIMARY INDEX (a)
     PARTITION BY (RANGE_N(c BETWEEN 1
                             AND  1200
                             EACH   30,
                   NO RANGE OR UNKNOWN),
                   RANGE_N(b BETWEEN 1
                             AND 11000
                             EACH    7,
                   NO RANGE OR UNKNOWN));
Table Name Table Cardinality (rows)
t1                                   1,000
t8                            9,000,000

For the following query, the number of rows returned by the subquery is small, so it is possible to eliminate many row partitions in t8 dynamically, making a product join cost effective. Additionally, it is not necessary to sort the t8 rows to make the join.

     SELECT COUNT(*)
     FROM t8
     WHERE (b,c) IN (SELECT a,b
                     FROM t1
                     WHERE c = 1);
          *** Query completed. One row found. One column returned.
           *** Total elapsed time was 1 second.
        Count(*)
     -----------
              65

On the other hand, if the Optimizer were to use a merge join instead of an inclusion product join with DPE, then the t8 rows would have to be sorted, which takes the majority of the processing time, and the query takes 57 seconds to complete rather than 1 second.

     SELECT COUNT(*)
     FROM t8
     WHERE (b,c) IN (SELECT a,b
                     FROM t1
                     WHERE c = 1);
     *** Query completed. One row found. One column returned.
      *** Total elapsed time was 57 seconds.
        Count(*)
     -----------
              65

Be aware that the performance enhancement achieved with inclusion and exclusion product joins with DPE is not always this great, and the degree of the enhancement depends heavily on the following factors:

  • The number of rows per row partition of the row-partitioned table
  • The number of columns projected by the subquery

Costing takes into account those cases for which the row-partitioned table has a highly variable number of rows per row partition, and avoids applying an inclusion or exclusion product join with DPE for those tables.

Inclusion Product Join With Dynamic Row Partition Elimination

Inclusion product join with dynamic row partition elimination (DPE), which is designed for use when the outer relation in a join based on an IN term is a row-partitioned table, differ from inclusion product joins without DPE in that the DPE versions of the join are driven from the inner table instead of the outer, or row-partitioned, table, while the non-DPE versions are driven from the outer table.

For example, suppose you have the following table definitions and query against those tables:

     CREATE SET TABLE MWS.t1, FALLBACK, NO BEFORE JOURNAL,
       NO AFTER JOURNAL,CHECKSUM = DEFAULT (
       a INTEGER,
       b INTEGER)
     PRIMARY INDEX (a);

     CREATE SET TABLE MWS.t2, FALLBACK, NO BEFORE JOURNAL,
       NO AFTER JOURNAL, CHECKSUM = DEFAULT (
       a INTEGER,
       b INTEGER)
     PRIMARY INDEX (a)
     PARTITION BY (RANGE_N(a BETWEEN  1 /* Partitioning level 1 */
                             AND  60000
                             EACH 60000,
                   NO RANGE OR UNKNOWN),
                   RANGE_N(b BETWEEN -3 /* Partitioning level 2 */
                             AND  31580
                             EACH     1,
                   NO RANGE, UNKNOWN) );
     SELECT *
     FROM t2
     WHERE (b) IN (SELECT a
                   FROM t1);

If the Optimizer were to choose a standard inclusion product join to execute this request, the system would perform the join as follows:

  1. Sort t1 to remove duplicate values of column a.
  2. Duplicate the sorted values of t1 in a spool.
  3. Read the first row in t2.

    If there are no more rows to read, the inclusion product join process is done.

  4. Search for a match based on the connecting term t2.b=t1.a until a match is found or all rows in spool have been read.
  5. Go to stage 3 and read the next row in t2.

Teradata Database reads all rows in t2 sequentially and joins them with the spool in this manner.

For an inclusion product join with DPE, the system does not read t2 sequentially, nor does it compare each row in t2 with all rows in the spool.

When a bind term is specified on all, or on a subset of, the partitioning levels of a row-partitioned table, the system can use the information to build a spool having the same row partitioning as the row-partitioned table. It can then join only the equivalent partitioning ranges between the base table row partitions and the spool row partitions.

The setup for the inclusion product join with DPE is the same as that for a standard inclusion product join, except that when the spool is duplicated, it is built with the same partitioning as is defined for the base table, and it is then sorted by partition number. In the previous example, the spool would be row-partitioned using the partitioning expression that is defined for row-partitioned table t2:

     PARTITION BY (RANGE_N(a BETWEEN -3
                             AND  31580
                             EACH     1,
                   NO RANGE, UNKNOWN) );

Note that the row partitioning is done on column a of the t1 spool because the bind term t2.b=t1.a  links partitioning on t2.b with t1.a.

The system performs the inclusion product join with DPE as follows:

  1. Sort t1 to remove duplicate values of column a.
  2. Duplicate the sorted values of t1 and row-partition them in a spool by the partitioning ranges defined for t1.
  3. Read the first row in the first row partition of the spool.
  4. Build the list of row partitions dynamically in the row-partitioned table to which the current spool partition is to join.
  5. Read the first row in the first DPE row partition of the row-partitioned table.
  6. Compare the row with all rows in the current spool row partition until either a match is found or there are no more rows in the spool row partition.
    IF a match is … THEN …
    found return the row-partitioned table row.
    not found go to stage 7.
  7. Read the next row of the row-partitioned table in the participating DPE partitions.
    IF there are … THEN go to stage …
    no more rows 6.
    more rows 4.
  8. Read the first row in the spool with a partition number greater than the current spool row partition.
    IF there … THEN …
    are no such rows the join is complete.
    is such a row Go to stage 2.

Exclusion Product Join With Dynamic Row Partition Elimination

The algorithm for an exclusion product join with DPE is similar the algorithm for an inclusion product join with DPE, but with a restriction and with some other differences.

The restriction is that exclusion product join with DPE is only enabled with row-partitioned tables where the partitioning expression at each partitioning level consists solely of a RANGE_N function, and the test expression is a simple column. The reason for this restriction is that rows with null partitioning column values at any bound partitioning level must be joined with all partitions at that level. Because that might eliminate much of the performance improvement gained from DPE, exclusion product join with DPE avoids reading the extra rows of the row-partitioned table by placing all spool rows with null partitioning column values into the error partition (meaning the NO RANGE partition in this case). Because expressions on columns containing nulls might produce non-nulls, there is no simple method of grouping all rows with nulls into one partition other than using the RANGE_N function with a simple column test expression.

To ensure that rows with null partitioning columns are placed in the error partition, the system modifies qualifying RANGE_N expressions to remove both the UNKNOWN and NO RANGE partitions from their definition when they are grouped together as NO RANGE OR UNKNOWN. The join algorithm then joins rows in the error partition with all of the rows of the row-partitioned table.

All connecting terms must reference partitioning columns for the Optimizer to choose an exclusion product join with DPE. If additional non-partitioning columns are components of the connecting conditions, then the Optimizer does not select the exclusion product join with DPE algorithm.

Teradata Database performs the exclusion product join with DPE as follows:

  1. Build a list of row partitions that includes all uneliminated partitions in the row-partitioned table.

    This is the unvisited row partition list, which must be joined with the error partition after all DPE partitions have been joined.

  2. Read the first row in the first row partition of the spool.
  3. Dynamically build the list of row partitions in t2 that this current spool row partition is to join.

    Remove this set of row partitions from the unvisited row partition list.

  4. Read the first row in the first DPE partition of the row-partitioned table.
  5. Compare the row with all rows in the current spool row partition and with all rows in the error partition.
    IF there is … THEN …
    a match go to stage 6.
    no match return the row of the row-partitioned table.
  6. Read the next row of the row-partitioned table in the participating DPE partitions.
    IF there are … THEN go to stage …
    no more rows 7.
    more rows 5.
  7. Read the first row in the spool with a partition number greater than the current spool partition.
    IF there … THEN go to stage …
    are no such rows 8.
    is such a row 5.
  8. Read the first row in the row-partitioned table not eliminated by the unvisited row partition list.
  9. IF … THEN …
    all of the connecting term columns are null do not return the row.

    Go to stage 12.

    some, but not all, of the connecting term columns are null join the row with the rows in the spool.

    Go to stage 11.

    none of the connecting term columns is null Go to stage 10.
  10. Join this row with all rows in the error partition of the spool.
  11. If no match is found, return the row.
  12. Read the next row in the row-partitioned table that was not eliminated by the unvisited partition list. If there is such a row, then go to stage 9.