Direct Row-partitioned PI Merge Join | Join Planning/Optimization | Vantage - Direct Row-partitioned PI Merge Join - Advanced SQL Engine

Direct Row-partitioned PI Merge Join | Join Planning/Optimization | Vantage - Direct Row-partitioned PI Merge Join - Advanced SQL Engine - Teradata Database

SQL Request and Transaction Processing

Product

Advanced SQL Engine

Teradata Database

Release Number

17.10

Published

July 2021

Language

English (United States)

Last Update

2021-07-28

dita:mapPath

uqf1592445067244.ditamap

dita:ditavalPath

uqf1592445067244.ditaval

dita:id

B035-1142

lifecycle

Product Category

Teradata Vantage™

About the Direct Row-partitioned PI Merge Join

The term direct merge join describes a join method in which the table or join index of interest is not spooled in preparation for a merge join, but instead is done directly. The Optimizer might choose a direct merge join when at minimum all columns of the primary index are specified in equality join terms.

To qualify for a direct row-partitioned PI merge join, there must be equality conditions on all the primary index columns of the two relations. This applies equally to character and non-character row-partitioned PIs. There are several forms of this optimization. The particular form selected by the Optimizer depends on factors such as the following:

Any additional conditions in the query
The total number of row partitions
The number of populated row partitions

In the following example, the Optimizer can choose to do a direct merge join of markets and market_penetration instead of redistributing both tables to spool, sorting the spool in hash order of the primary index, and then doing a row hash merge join.

The example uses the following table definitions:

CREATE TABLE markets (
  productid       INTEGER NOT NULL,
  region          BYTEINT NOT NULL,
  activity_date   DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  revenue_code    BYTEINT NOT NULL,
  business_sector BYTEINT NOT NULL,
  note            VARCHAR(256))
PRIMARY INDEX (productid, region)
PARTITION BY (
RANGE_N(region          BETWEEN 1
                        AND     9
                        EACH    3),
RANGE_N(business_sector BETWEEN 0
                        AND    49
                        EACH   10),
RANGE_N(revenue_code    BETWEEN 1
                        AND    34
                        EACH    2),
RANGE_N(activity_date   BETWEEN DATE '1986-01-01'
                        AND     DATE '2007-05-31'
                        EACH INTERVAL '1' MONTH));
CREATE TABLE market_penetration (
  productid       INTEGER NOT NULL,
  region          BYTEINT NOT NULL,
  activity_date   DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  revenue_code    BYTEINT NOT NULL,
  business_sector BYTEINT NOT NULL,
  saturation      FLOAT)
PRIMARY INDEX (productid, region)
PARTITION BY (
RANGE_N(region BETWEEN 1
               AND     9
               EACH    3),
RANGE_N(business_sector BETWEEN 0
                        AND    49
                        EACH   10),
RANGE_N(revenue_code    BETWEEN 1
                        AND    34
                        EACH    2),
RANGE_N(activity_date   BETWEEN DATE '1986-01-01'
                        AND    DATE '2007-05-31'
                        EACH INTERVAL '1' MONTH));

The example request joins markets and market_penetration. Because of the specified conditions, the Optimizer is able to select a direct row-partitioned PI-to-row-partitioned PI merge join to join the relations.

SELECT a.*, b.saturation
FROM   markets AS a INNER JOIN market_penetration AS b
WHERE  a.productid       = b.productid
AND    a.region          = b.region
AND    a.business_sector = b.business_sector
AND    a.revenue_code    = b.revenue_code
AND    a.activity_code   = b.activity_code;