Row partition elimination is a method for enhancing query performance against row partitioned tables by skipping row partitions that do not contain rows that meet the search conditions of a query. Row partition elimination is an automatic optimization in which the Optimizer (or, in the case of dynamic row partition elimination, the AMP software) determines, based on query conditions and a row partitioning expression, that some partitions for that partitioning expression cannot contain qualifying rows; therefore, those row partitions can be skipped during a file scan.
Note that the Optimizer cannot exert all of the optimizations that are possible for each of the individual types of row partition elimination.
Partitions that are skipped for a particular query are called eliminated row partitions.
When there are multiple partitioning expressions, Teradata Database combines row partition elimination at each of the levels to further reduce the number of data subsets that need to be scanned. For most applications, the greatest benefit of row partitioning is obtained from row partition elimination.
Performance Gains Realized From Row Partition Elimination
The performance gain realized from row partitioning depends both on the number of row partitions defined and the specific query being measured. In the optimal case, the query conditions eliminate all but one row partition for each partitioning expression. When there are many thousands of row partitions, with a reasonably even distribution of rows among them, the elapsed time for such a query can be far less than 1% of the time that it would take to run the same query against a nonpartitioned table.
The following table lists the results of actual performance tests. The Baseline column is the performance for a nonpartitioned table, and the partitioning column is the performance for a counterpart table with a single-level partitioning. The tests are realistic, but the results obtained by running the same tests using your configuration and workloads might vary.
|Test Description||Baseline Result (seconds)||Result With PPI (seconds)||Improvement|
|Select rows that have a specified value of the partitioning column from a table having 200 equal-sized partitions.||59||1||98.3% reduction in elapsed time|
|Select a month of activity from one row partition containing 6 months of data from a table having 11 years of data contained in 40 row partitions of unequal size.||58||2||96.5% reduction in elapsed time|
|Delete rows that have a specified value of the partitioning column from a table having 200 equal-sized row partitions.||239||1||99.996% reduction in elapsed time|
|Update one column in each row that has a specified value of the partitioning column from a table having 200 equal-sized row partitions.||237||3||98.7% reduction in elapsed time|
|Test Description||Baseline Result (rows/second/node)||Result With PPI (rows/second/node)||Improvement|
|MultiLoad INSERT a number of rows equal to 1% of the table size into one row partition out of 200.||1,394||14,742||10.58 times faster|
|MultiLoad INSERT a number of rows equal to 1% of the table size into one row partition out of 200 with one NUSI defined on the table.||841||5,666||6.74 times faster|
For the column labelled Baseline Result, the larger the number of rows per second per node processed, the better.
Teradata Database supports several different types of row partition elimination:
When query conditions are such that they allow row partition elimination to be specified by the Optimizer during the early stages of query optimization, the form of row partition elimination used is referred to as static row partition elimination.
When query conditions are based on a comparison derived in part from USING request modifier variables or from the result of a built-in function, it is not possible for the Optimizer to reuse a cached query plan as it would otherwise do because a cached plan needs to be general enough to handle changes in search condition values in subsequent executions.
In this case, the Optimizer applies row partition elimination at a later point in the optimization process, at the time it builds the finalized query plan from a cached plan using the values for this specific execution of the plan. This form of row partition elimination is referred to as delayed row partition elimination.
When query conditions reference values in other tables that would allow row partition elimination, row partition elimination is performed dynamically by the AMP database software after a query has already been optimized and while it is executing. This form of row partition elimination is referred to as dynamic row partition elimination.
Dynamic row partition elimination can also be used to simplify and enhance join performance by selecting the least costly method from a set of join methods especially designed to be used with row partition elimination.
Row partition elimination methods can be mixed within the same query. For example, static row partition elimination can be used for some partition levels, while dynamic row partition elimination can be used for other levels, and some levels might not have any row partition elimination. Some individual row partition levels might even benefit from a mix of multiple forms of row partition elimination.
Teradata Database can eliminate row partitions from search consideration at any number of levels or at a combination of levels.
Any single-table constraints on partitioning columns can be used for static row partition elimination, including those on the system-derived column PARTITION or any of the members of the system-derived PARTITION#L n column set, where the value of n ranges from 1 through 62, inclusive. See SQL Data Definition Language for more information about the system-derived PARTITION columns.
See SQL Request and Transaction Processing for more information about the various forms of row partition elimination and their role in query optimization.