Single-Window Merge Join | Join Planning/Optimization | Teradata Vantage - Single-Window Merge Join - Analytics Database - Teradata Vantage

SQL Request and Transaction Processing

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
dita:mapPath
zfm1628111633230.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
evd1472255317510
lifecycle
latest
Product Category
Teradata Vantageā„¢

A single-window merge join is similar to a sliding-window merge join (see Sliding-Window Merge Join) except that the number of populated partitions after static or delayed row partition elimination has been applied for each table in the join is small enough to be able to handle all the partitions at once. This could be either a join of a PPI table to a PPI table, a PPI table to a nonpartitioned primary-indexed table, a nonpartitioned primary-indexed table to a PPI table, a PPI table to a spool, or a spool to a PPI table.

The two relations must be joined with equality conditions on the primary index. There must be a limited number of participating row partitions. The number of participating row partitions is estimated based on the estimated number of populated row partitions after any static partition elimination. The maximum number of combined partitions processed in as a single set is determined in the same way as for single-level row partitioning.

This calculation is based on the setting of the DBS Control field PPICacheThrP.

The Optimizer estimates whether the set is small enough. If the estimate differs from the actual data, a sliding-window merge join may be used. For example, the estimate may indicate a single-window merge join is the best plan, but a sliding-window merge join may be done if populated row partitions exist after static or delayed row partition elimination has been applied. The Optimizer may also estimate that the cost of a single-window merge join exceeds that of another join method, and therefore chooses that method instead.

A single-window merge join must first spool and then sort the column-partitioned relation when one of the relations is column-partitioned.

Single-window merge joins can be used for both non-character-partitioned tables and for character-partitioned tables.

In the following example, the system joins 2 tables on their primary indexes. The WHERE clause conditions eliminate all but two partitions for level 1, and 3 partitions for level 2 of orders. The WHERE clause conditions also eliminate all but 1 partition for level 1, and 7 partitions for level 2 of lineitem. After applying these conditions, the system must join 6 partitions of the combined expression for orders to 7 partitions of the combined partitioning expression of lineitem, making a total of 13 partitions.

Because of this situation, the system is able to join the 2 tables using a direct merge join, assuming the Optimizer estimates that join method to be the most cost effective. When joining the two sets of partitions, the direct merge join operation handles the row sets as if the sets are in hash order by maintaining a memory-resident data block for each populated partition.

The definition DDL statement text for the 2 tables to be joined in this example query is as follows:

CREATE TABLE orders (
  o_orderkey      INTEGER NOT NULL,
  o_custkey       INTEGER,
  o_orderstatus   CHARACTER(1) CASESPECIFIC,
  o_totalprice    DECIMAL(13,2) NOT NULL,
  o_orderdate     DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  o_orderpriority CHARACTER(21),
  o_clerk         CHARACTER(16),
  o_shippriority  INTEGER,
  o_comment       VARCHAR(79))
PRIMARY INDEX (o_orderkey)
PARTITION BY (
RANGE_N(o_custkey   BETWEEN 0
                    AND 49999
                    EACH  100),
RANGE_N(o_orderdate BETWEEN DATE '2000-01-01'
                    AND DATE '2006-12-31'
                    EACH INTERVAL '1' MONTH))
UNIQUE INDEX (o_orderkey);
CREATE TABLE lineitem (
  l_orderkey      INTEGER NOT NULL,
  l_partkey       INTEGER NOT NULL,
  l_suppkey       INTEGER,
  l_linenumber    INTEGER,
  l_quantity      INTEGER NOT NULL,
  l_extendedprice DECIMAL(13,2) NOT NULL,
  l_discount      DECIMAL(13,2),
  l_tax           DECIMAL(13,2),
  l_returnflag    CHARACTER(1),
  l_linestatus    CHARACTER(1),
  l_shipdate      DATE FORMAT 'yyyy-mm-dd',
  l_commitdate    DATE FORMAT 'yyyy-mm-dd',
  l_receiptdate   DATE FORMAT 'yyyy-mm-dd',
  l_shipinstruct  VARCHAR(25),
  l_shipmode      VARCHAR(10),
  l_comment       VARCHAR(44))
PRIMARY INDEX (l_orderkey)
PARTITION BY (
RANGE_N(l_suppkey  BETWEEN 0
                   AND  4999
                   EACH   10),
RANGE_N(l_shipdate BETWEEN DATE '2000-01-01'
                   AND     DATE '2006-12-31'
                   EACH INTERVAL '1' MONTH));

The Optimizer applies a single-window merge join to join the orders and lineitem tables when creating a join plan for the following query:

SELECT *
FROM orders INNER JOIN lineitem
WHERE o_orderkey = l_orderkey
  AND o_orderdate BETWEEN DATE '2005-04-01'
                  AND     DATE '2005-06-30'
  AND o_custkey   IN (618, 973)
  AND l_shipdate  BETWEEN DATE '2005-04-01'
                  AND     DATE '2005-10-31'
  AND l_suppkey = 4131;