The term direct merge join describes a join method in which the table or join index of interest is not spooled in preparation for a merge join, but instead is done directly. The Optimizer may choose a direct merge join when at minimum all columns of the primary index are specified in equality join terms.
- Any additional conditions in the query
- The total number of row partitions
- The number of populated row partitions
In the following example, the Optimizer can choose to do a direct merge join of markets and market_penetration instead of redistributing both tables to spool, sorting the spool in hash order of the primary index, and then doing a row hash merge join.
The example uses the following table definitions:
CREATE TABLE markets ( productid INTEGER NOT NULL, region BYTEINT NOT NULL, activity_date DATE FORMAT 'yyyy-mm-dd' NOT NULL, revenue_code BYTEINT NOT NULL, business_sector BYTEINT NOT NULL, note VARCHAR(256)) PRIMARY INDEX (productid, region) PARTITION BY ( RANGE_N(region BETWEEN 1 AND 9 EACH 3), RANGE_N(business_sector BETWEEN 0 AND 49 EACH 10), RANGE_N(revenue_code BETWEEN 1 AND 34 EACH 2), RANGE_N(activity_date BETWEEN DATE '1986-01-01' AND DATE '2007-05-31' EACH INTERVAL '1' MONTH)); CREATE TABLE market_penetration ( productid INTEGER NOT NULL, region BYTEINT NOT NULL, activity_date DATE FORMAT 'yyyy-mm-dd' NOT NULL, revenue_code BYTEINT NOT NULL, business_sector BYTEINT NOT NULL, saturation FLOAT) PRIMARY INDEX (productid, region) PARTITION BY ( RANGE_N(region BETWEEN 1 AND 9 EACH 3), RANGE_N(business_sector BETWEEN 0 AND 49 EACH 10), RANGE_N(revenue_code BETWEEN 1 AND 34 EACH 2), RANGE_N(activity_date BETWEEN DATE '1986-01-01' AND DATE '2007-05-31' EACH INTERVAL '1' MONTH));
The example request joins markets and market_penetration. Because of the specified conditions, the Optimizer is able to select a direct row-partitioned PI-to-row-partitioned PI merge join to join the relations.
SELECT a.*, b.saturation FROM markets AS a INNER JOIN market_penetration AS b WHERE a.productid = b.productid AND a.region = b.region AND a.business_sector = b.business_sector AND a.revenue_code = b.revenue_code AND a.activity_code = b.activity_code;