Single-Window Merge Join | Join Planning/Optimization | Teradata Vantage - Single-Window Merge Join - Advanced SQL Engine - Teradata Database

SQL Request and Transaction Processing

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-24
dita:mapPath
ykx1561500561173.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1142
lifecycle
previous
Product Category
Teradata Vantageā„¢

About the Single-Window Merge Join

A single-window merge join is very similar to a sliding-window merge join (see Sliding-Window Merge Join) except that the number of populated partitions after static or delayed row partition elimination has been applied for each table in the join is small enough to be able to handle all the partitions at once. This could be either a join of a PPI table to a PPI table, a PPI table to a nonpartitioned primary-indexed table, a nonpartitioned primary-indexed table to a PPI table, a PPI table to a spool, or a spool to a PPI table.

The two relations must be joined with equality conditions on the primary index. There must be a limited number of participating row partitions. The number of participating row partitions is estimated based on the estimated number of populated row partitions after any static partition elimination. The maximum number of combined partitions processed in as a single set is determined in the same way as for single-level row partitioning.

This calculation is based on the setting of the DBS Control field PPICacheThrP.

The Optimizer estimates whether the set will be small enough; if the estimate differs from the actual data, a sliding-window merge join might or might not be used. For example, the estimate might indicate a single-window merge join is the best plan but a sliding-window merge join may be done if many populated row partitions exist after static or delayed row partition elimination has been applied. The Optimizer might also estimate that the cost of a single-window merge join would exceed that of some other join method, and so would choose to use that method instead.

A single-window merge join must first spool and then sort the column-partitioned relation when one of the relations is column-partitioned.

Single-window merge joins can be used for both non-character-partitioned tables and for character-partitioned tables.

In the following example, the system joins 2 tables on their primary indexes. The WHERE clause conditions eliminate all but two partitions for level 1, and 3 partitions for level 2 of orders. The WHERE clause conditions also eliminate all but 1 partition for level 1, and 7 partitions for level 2 of lineitem. After the system applies these conditions, it must join 6 partitions of the combined expression for orders to 7 partitions of the combined partitioning expression of lineitem, making a total of 13 partitions.

Because of this situation, the system is able to join the 2 tables using a direct merge join, assuming the Optimizer estimates it to be the most cost effective join method. When joining the two sets of partitions, the direct merge join operation handles the row sets logically as if they are in hash order by maintaining a memory-resident data block for each populated partition.

The definition DDL statement text for the 2 tables to be joined in this example query is as follows:

CREATE TABLE orders (
  o_orderkey      INTEGER NOT NULL,
  o_custkey       INTEGER,
  o_orderstatus   CHARACTER(1) CASESPECIFIC,
  o_totalprice    DECIMAL(13,2) NOT NULL,
  o_orderdate     DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  o_orderpriority CHARACTER(21),
  o_clerk         CHARACTER(16),
  o_shippriority  INTEGER,
  o_comment       VARCHAR(79))
PRIMARY INDEX (o_orderkey)
PARTITION BY (
RANGE_N(o_custkey   BETWEEN 0
                    AND 49999
                    EACH  100),
RANGE_N(o_orderdate BETWEEN DATE '2000-01-01'
                    AND DATE '2006-12-31'
                    EACH INTERVAL '1' MONTH))
UNIQUE INDEX (o_orderkey);
CREATE TABLE lineitem (
  l_orderkey      INTEGER NOT NULL,
  l_partkey       INTEGER NOT NULL,
  l_suppkey       INTEGER,
  l_linenumber    INTEGER,
  l_quantity      INTEGER NOT NULL,
  l_extendedprice DECIMAL(13,2) NOT NULL,
  l_discount      DECIMAL(13,2),
  l_tax           DECIMAL(13,2),
  l_returnflag    CHARACTER(1),
  l_linestatus    CHARACTER(1),
  l_shipdate      DATE FORMAT 'yyyy-mm-dd',
  l_commitdate    DATE FORMAT 'yyyy-mm-dd',
  l_receiptdate   DATE FORMAT 'yyyy-mm-dd',
  l_shipinstruct  VARCHAR(25),
  l_shipmode      VARCHAR(10),
  l_comment       VARCHAR(44))
PRIMARY INDEX (l_orderkey)
PARTITION BY (
RANGE_N(l_suppkey  BETWEEN 0
                   AND  4999
                   EACH   10),
RANGE_N(l_shipdate BETWEEN DATE '2000-01-01'
                   AND     DATE '2006-12-31'
                   EACH INTERVAL '1' MONTH));

The Optimizer applies a single-window merge join to join the orders and lineitem tables when creating a join plan for the following query:

SELECT *
FROM orders INNER JOIN lineitem
WHERE o_orderkey = l_orderkey
  AND o_orderdate BETWEEN DATE '2005-04-01'
                  AND     DATE '2005-06-30'
  AND o_custkey   IN (618, 973)
  AND l_shipdate  BETWEEN DATE '2005-04-01'
                  AND     DATE '2005-10-31'
  AND l_suppkey = 4131;