Row partition elimination can occur with conditions comparing a partitioning column to a USING request modifier variable, including host variables, or with built-in function such as CURRENT_DATE or CURRENT_TIMESTAMP. Teradata Database might not cache such plans because cached plans must be suitably generalizable to handle changes in the cached expressions in subsequent executions of the plan (see Parameterized Requests). However, there are still optimization opportunities available for such conditions. For example, in certain cases, the system can delay row partition elimination until it builds the finalized plan from a cached plan using the values for this specific execution of the plan.
Equality Conditions That Define a Single Combined Partition
Delayed row partition elimination occurs for USING request modifier variables and built-in functions for equality over all row partitioning levels on all partitioning columns. In other words, constraints must exist such that a single combined partition of the combined partitioning expression is specified when the system applies the values of the USING request modifier variable or built-in function to build a finalized plan from a cacheable plan.
Some of the equality conditions might be constant conditions. No restrictions are defined on the form of the partitioning expressions; however, the following restriction does exist on the form of an equality constraint: it must be a simple equality condition between a partitioning column and any of the following:
- USING request modifier variable
- Built-in function
- Constant expression
There must be an ANDed equality condition on the partitioning columns such that a single combined row partition of the combined partitioning expression is specified.
For example, consider the following table definition:
CREATE TABLE markets ( productid INTEGER NOT NULL region BYTEINT NOT NULL activity_date DATE FORMAT 'yyyy-mm-dd' NOT NULL revenue_code BYTEINT NOT NULL business_sector BYTEINT NOT NULL note VARCHAR(256) PRIMARY INDEX (productid, region) PARTITION BY RANGE_N(region BETWEEN 1 AND 9 EACH 3) RANGE_N(business_sector BETWEEN 0 AND 49 EACH 10) RANGE_N(revenue_code BETWEEN 1 AND 34 EACH 2) RANGE_N(activity_date BETWEEN DATE '1986-01-01' AND DATE '2007-05-31' EACH INTERVAL '1' MONTH))
This is the same table that was defined in Examples of Rewrites Using Row Partitioning.
For the sake of simplicity, the comments associated with the example queries make the following assumptions:
- Rows are evenly distributed among the combined partitions.
- There are many data blocks per row partition.
If the example queries were not able to capitalize on delayed row partition elimination, they would all require all-AMP, full-table scans to be processed unless otherwise noted.
This example assumes a 2-byte internal partition number. For the following query, which specifies both a set of USING request modifier values and a built-in function, Teradata Database reads one combined row partition on all AMPs (combined because the conditions in the predicate are on all row partitioning levels of the table). As a result, it reads only 1/65,535 of the data blocks for the markets table because with all partitioning levels specified in the predicate, all 65,535 combined partitions are eligible to be scanned, but Teradata Database needs to read only the one combined row partition of those possible 65,535 combined row partitions to access all of the qualified rows.
The combined row partition might start and end in the middle of data blocks, and those data blocks might include rows whose values for the query predicate do not qualify. Teradata Database necessarily reads those rows as it reads the data blocks for the combined row partition, but because they do not qualify for the query predicate, the system does not return them in the result set.
USING (r BYTEINT, b BYTEINT, rc BYTEINT) SELECT * FROM markets WHERE region=:r AND business_sector=:b AND revenue_code=:rc AND activity_date=CURRENT_DATE;
Delayed Row Partition Elimination and Character-Partitioned Tables
Partition row elimination can occur with conditions comparing a partitioning column to a USING variable, including host variables, or a built-in function such as CURRENT_DATE).
The following additional conditions must be met for the Optimizer to apply delayed row partition elimination for a character-partitioned table:
- There must be at most one character partitioning column at all row partitioning levels.
- If the session collation or table collation is MULTINATIONAL or CHARSET_COLL, and if any comparison or string function involving any non-constant expression in a partitioning expression at any level is case blind, then the session collation must match the table collation.
For delayed row partition elimination, the following are case blind:
- UPPERCASE qualifier
The following are case sensitive:
The concatenation operator is considered to be both case sensitive and case blind.
The following functions follow the same rules as comparison operators, and Teradata Database examines the function inputs along with default case sensitivity for the session mode, either ANSI or Teradata, to determine case sensitivity:
The presence of the SUBSTRING function does not affect case sensitivity.
- If the WHERE clause condition qualifying a level for delayed row partition elimination is case blind, then all comparisons in the partitioning expression for that level must be case blind, and all string functions involving any non-constant expression of the partitioning expression must be case blind.
A WHERE clause condition is considered to be case blind if any of the comparisons or string functions involving non-constant expressions in the condition is case blind.
If the equality condition that qualifies a level for delayed row partition elimination is case blind, then all comparisons at a given row partitioning level must also be case blind. While this must be true for all row partitioning levels, case sensitivity does not need to match between different row partitioning levels.
If a character partitioning level involves any case specific comparisons of character data, then the corresponding ANDed equality conditions involving the partitioning columns for the level must also be case specific.
Assume you have created the following table, which is used in the SELECT request example that follows:
CREATE SET TABLE user_IDs, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT ( user_ID INTEGER, user_name CHARACTER(40) CHARACTER SET UNICODE CASESPECIFIC) PRIMARY INDEX (user_ID) PARTITION BY RANGE_N(user_name BETWEEN 'A','Z','a','y' AND 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz', NO RANGE OR UNKNOWN);
For the sake of simplicity, assume an even distribution of the rows among the row partitions and many data blocks per row partition. Without this form of delayed row partition elimination, the request would require an all-AMP, full-table scan to return the requested result set.
Because it uses delayed row partition elimination, Teradata Database reads one row partition on all AMPs. The system substitutes the value of the built-in function USER for user_name; therefore, Teradata Database only needs to read roughly 1/5 of the data blocks for user_IDs.
Because the data blocks containing the row partition include some non-qualifying rows in the first and last of these data blocks, Teradata Database must also read them as it reads the row partition, but it does not return those rows as part of the result set.
SELECT * FROM user_IDs WHERE user_name=USER;