Direct Row-partitioned PI Merge Join | Join Planning/Optimization | Vantage - Direct Row-partitioned PI Merge Join - Analytics Database - Teradata Vantage

SQL Request and Transaction Processing

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
dita:mapPath
zfm1628111633230.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
evd1472255317510
lifecycle
latest
Product Category
Teradata Vantageā„¢

The term direct merge join describes a join method in which the table or join index of interest is not spooled in preparation for a merge join, but instead is done directly. The Optimizer may choose a direct merge join when at minimum all columns of the primary index are specified in equality join terms.

To qualify for a direct row-partitioned PI merge join, there must be equality conditions on all the primary index columns of the two relations. This applies equally to character and noncharacter row-partitioned PIs. There are multiple forms of this optimization. The particular form selected by the Optimizer depends on factors such as the following:
  • Any additional conditions in the query
  • The total number of row partitions
  • The number of populated row partitions

In the following example, the Optimizer can choose to do a direct merge join of markets and market_penetration instead of redistributing both tables to spool, sorting the spool in hash order of the primary index, and then doing a row hash merge join.

The example uses the following table definitions:

CREATE TABLE markets (
  productid       INTEGER NOT NULL,
  region          BYTEINT NOT NULL,
  activity_date   DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  revenue_code    BYTEINT NOT NULL,
  business_sector BYTEINT NOT NULL,
  note            VARCHAR(256))
PRIMARY INDEX (productid, region)
PARTITION BY (
RANGE_N(region          BETWEEN 1
                        AND     9
                        EACH    3),
RANGE_N(business_sector BETWEEN 0
                        AND    49
                        EACH   10),
RANGE_N(revenue_code    BETWEEN 1
                        AND    34
                        EACH    2),
RANGE_N(activity_date   BETWEEN DATE '1986-01-01'
                        AND     DATE '2007-05-31'
                        EACH INTERVAL '1' MONTH));
CREATE TABLE market_penetration (
  productid       INTEGER NOT NULL,
  region          BYTEINT NOT NULL,
  activity_date   DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  revenue_code    BYTEINT NOT NULL,
  business_sector BYTEINT NOT NULL,
  saturation      FLOAT)
PRIMARY INDEX (productid, region)
PARTITION BY (
RANGE_N(region BETWEEN 1
               AND     9
               EACH    3),
RANGE_N(business_sector BETWEEN 0
                        AND    49
                        EACH   10),
RANGE_N(revenue_code    BETWEEN 1
                        AND    34
                        EACH    2),
RANGE_N(activity_date   BETWEEN DATE '1986-01-01'
                        AND    DATE '2007-05-31'
                        EACH INTERVAL '1' MONTH));

The example request joins markets and market_penetration. Because of the specified conditions, the Optimizer is able to select a direct row-partitioned PI-to-row-partitioned PI merge join to join the relations.

SELECT a.*, b.saturation
FROM   markets AS a INNER JOIN market_penetration AS b
WHERE  a.productid       = b.productid
AND    a.region          = b.region
AND    a.business_sector = b.business_sector
AND    a.revenue_code    = b.revenue_code
AND    a.activity_code   = b.activity_code;