Delayed Row Partition Elimination | Optimizer Process | Teradata Vantage - Delayed Row Partition Elimination - Advanced SQL Engine - Teradata Database

SQL Request and Transaction Processing

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-24
dita:mapPath
ykx1561500561173.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1142
lifecycle
previous
Product Category
Teradata Vantageā„¢

Row partition elimination can occur with conditions comparing a partitioning column to a USING request modifier variable, including host variables, or with built-in function such as CURRENT_DATE or CURRENT_TIMESTAMP. The database might not cache such plans because cached plans must be suitably generalizable to handle changes in the cached expressions in subsequent executions of the plan (see Parameterized Requests). However, there are still optimization opportunities available for such conditions. For example, in certain cases, the system can delay row partition elimination until it builds the finalized plan from a cached plan using the values for this specific execution of the plan.

Equality Conditions That Define a Single Combined Partition

Delayed row partition elimination occurs for USING request modifier variables and built-in functions for equality over all row partitioning levels on all partitioning columns. In other words, constraints must exist such that a single combined partition of the combined partitioning expression is specified when the system applies the values of the USING request modifier variable or built-in function to build a finalized plan from a cacheable plan.

Some of the equality conditions might be constant conditions. No restrictions are defined on the form of the partitioning expressions; however, the following restriction does exist on the form of an equality constraint: it must be a simple equality condition between a partitioning column and any of the following:
  • USING request modifier variable
  • Built-in function
  • Constant expression

There must be an ANDed equality condition on the partitioning columns such that a single combined row partition of the combined partitioning expression is specified.

For example, consider the following table definition:

CREATE TABLE markets (
  productid       INTEGER NOT NULL
  region          BYTEINT NOT NULL
  activity_date   DATE FORMAT 'yyyy-mm-dd' NOT NULL
  revenue_code    BYTEINT NOT NULL
  business_sector BYTEINT NOT NULL
  note            VARCHAR(256)
PRIMARY INDEX (productid, region)
PARTITION BY RANGE_N(region          BETWEEN 1
                                     AND     9
                                     EACH    3)
             RANGE_N(business_sector BETWEEN 0 
                                     AND    49
                                     EACH   10)
             RANGE_N(revenue_code    BETWEEN 1
                                     AND    34
                                     EACH    2)
             RANGE_N(activity_date   BETWEEN DATE '1986-01-01'
                                     AND     DATE '2007-05-31'
                                     EACH INTERVAL '1' MONTH))

This is the same table that was defined in Examples of Rewrites Using Row Partitioning.

For the sake of simplicity, the comments associated with the example queries make the following assumptions:
  • Rows are evenly distributed among the combined partitions.
  • There are many data blocks per row partition.

If the example queries were not able to capitalize on delayed row partition elimination, they would all require all-AMP, full-table scans to be processed unless otherwise noted.

This example assumes a 2-byte internal partition number. For the following query, which specifies both a set of USING request modifier values and a built-in function, the database reads one combined row partition on all AMPs (combined because the conditions in the predicate are on all row partitioning levels of the table). As a result, it reads only 1/65,535 of the data blocks for the markets table because with all partitioning levels specified in the predicate, all 65,535 combined partitions are eligible to be scanned, but the database needs to read only the one combined row partition of those possible 65,535 combined row partitions to access all of the qualified rows.

The combined row partition might start and end in the middle of data blocks, and those data blocks might include rows whose values for the query predicate do not qualify. The database necessarily reads those rows as it reads the data blocks for the combined row partition, but because they do not qualify for the query predicate, the system does not return them in the result set.

USING (r BYTEINT, b BYTEINT, rc BYTEINT)
SELECT *
FROM markets
WHERE region=:r
AND   business_sector=:b
AND   revenue_code=:rc
AND   activity_date=CURRENT_DATE;

Delayed Row Partition Elimination and Character-Partitioned Tables

Partition row elimination can occur with conditions comparing a partitioning column to a USING variable, including host variables, or a built-in function such as CURRENT_DATE).

The following additional conditions must be met for the Optimizer to apply delayed row partition elimination for a character-partitioned table:
  • There must be at most one character partitioning column at all row partitioning levels.
  • If the session collation or table collation is MULTINATIONAL or CHARSET_COLL, and if any comparison or string function involving any non-constant expression in a partitioning expression at any level is case insensitive, then the session collation must match the table collation.

    For delayed row partition elimination, the following are case-insensitive:

    • LOWER
    • SOUNDEX
    • UPPER
    • UPPERCASE qualifier

    The following are case-sensitive:

    • CHAR2HEXINT
    • TRANSLATE
    • TRANSLATE_CHK
    • VARGRAPHIC

    The concatenation operator is considered to be both case-sensitive and case-insensitive.

    The following functions follow the same rules as comparison operators, and the database examines the function inputs along with default case sensitivity for the session mode, either ANSI or Teradata, to determine case sensitivity:
    • INDEX
    • MINDEX
    • POSITION

    The presence of the SUBSTRING function does not affect case sensitivity.

  • If the WHERE clause condition qualifying a level for delayed row partition elimination is case-insensitive, then all comparisons in the partitioning expression for that level must be case-insensitive, and all string functions involving any non-constant expression of the partitioning expression must be case insensitive.

    A WHERE clause condition is considered to be case-insensitive if any of the comparisons or string functions involving non-constant expressions in the condition is case-insensitive.

If the equality condition that qualifies a level for delayed row partition elimination is case-insensitive, then all comparisons at a given row partitioning level must also be case-insensitive. While this must be true for all row partitioning levels, case sensitivity does not need to match between different row partitioning levels.

If a character partitioning level involves any case specific comparisons of character data, then the corresponding ANDed equality conditions involving the partitioning columns for the level must also be case specific.

Assume you have created the following table, which is used in the SELECT request example that follows:

CREATE SET TABLE user_IDs, NO FALLBACK, NO BEFORE JOURNAL,
                           NO AFTER JOURNAL, CHECKSUM = DEFAULT (
  user_ID   INTEGER,
  user_name CHARACTER(40) CHARACTER SET UNICODE CASESPECIFIC)
PRIMARY INDEX (user_ID)
PARTITION BY RANGE_N(user_name BETWEEN 'A','Z','a','y' AND
                       'zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz',
             NO RANGE OR UNKNOWN);

For the sake of simplicity, assume an even distribution of the rows among the row partitions and many data blocks per row partition. Without this form of delayed row partition elimination, the request would require an all-AMP, full-table scan to return the requested result set.

Because it uses delayed row partition elimination, the database reads one row partition on all AMPs. The system substitutes the value of the built-in function USER for user_name; therefore, the database only needs to read roughly 1/5 of the data blocks for user_IDs.

Because the data blocks containing the row partition include some non-qualifying rows in the first and last of these data blocks, the database must also read them as it reads the row partition, but it does not return those rows as part of the result set.

SELECT *
FROM user_IDs
WHERE user_name=USER;