15.00 - Importance of Partition Order for Specifying Partitioning Expressions - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Importance of Partition Order for Specifying Partitioning Expressions

The specification order of partitioning expressions can be important for multilevel partitioning. The system maps multilevel partitioning expressions into a single‑level combined partitioning expression. It then maps the resulting combined partition number 1‑to-1 to an internal partition number. Rows are in logical RowID order, where a RowID consists of an internal partition number, a row hash value, and a row uniqueness value (RowIDs are identical for single-level and multilevel partitioning).

This implies a partial physical ordering based on how the file system manages the data blocks and cylinders, though this is not a strict relationship. This physical ordering maintains the ordering of partitions (except for a possible wraparound of internal numbers at one point in the internal number sequence for each level). Multilevel partitioning expressions are analogous with a single‑level partitioning expression that is identical to the combined partitioning expression for multilevel partitioning, at least in terms of expressing how rows are logically ordered by the file system.

There are several implications of this ordering.

  • Usage
  • You can alter a partition level to change the number of partitions at that level to within 1 and the maximum number of partitions defined for that level when the table is populated with rows.

  • Query performance
  • Row partition elimination at the lowest levels can increase overhead because of the frequent need to skip to the next internal partition to be read. This is because a partition at a lower level is split among the partitions at higher levels in the partition hierarchy.

    At higher levels in the partition hierarchy, there are more contiguous internal partitions to scan and skip.

  • Bulk data load performance
  • If a load is order‑based on one of the partitioning columns, having that partitioning column at the highest level in the partition hierarchy can improve load performance because those partitions are contiguous. For example, a date‑based partition with daily data loads might benefit from having the date‑based partitioning at the first level.

    You define the ordering of the partitioning expressions in your CREATE TABLE SQL text, and that ordering implies the logically ordering by RowID. Because the partitions at each level are distributed among the partitions of the next higher level in the hierarchy, scanning a partition at a certain level requires skipping some internal partitions.

    If the number of rows in each internal partition is large, then skipping to the next internal partition to be read incurs relatively little overhead.

    If the system reads or skips only a few rows for each internal partition, performance might be somewhat worse than a full-table scan (or even significantly worse if a full-table scan were able to get more benefit out of block read-ahead or cylinder reads). You can exploit this ordering for a locality of reference performance benefit (meaning that by increasing the probability that certain partitions fall in the same cylinder by having their partitioning expression at a higher level, you can greatly enhance the performance of the scan).

    Partition expression order does not affect the ability to eliminate partitions, but does affect the efficiency of a partition scan. As a general rule, this should not be a concern if there are many rows, which implies multiple data blocks, in each of the partitions.

    Consider the following 2 cases.

  • For a table with 65,535 combined partitions, the maximum number of combined partitions per partitioned primary index is 6.5535 x 109 (billion) rows per AMP, 100 byte rows, and 50 KB data blocks, and assuming an equal distribution of rows among the partitions, each combined partition spans 200 to 201 data blocks.
  • In this case, skipping over internal partitions should not incur significant overhead. Either all, or nearly all, of the rows in the data blocks read qualify, and the system skips at least 199 data blocks between each set of data blocks read.

  • If the table has only 6.5535 x 106 (million) rows per AMP, each combined partition has about 0.2 data blocks. In the worst case, where the system eliminates only every other partition at the lowest level, every data block is read, and a full‑table scan would be more efficient.
  • If there are 5 partitions at the lowest level, and the system can eliminate 4 out of 5 partitions, it still must read every data block.

    If there are 6 partitions at the lowest level, and the system can eliminate 5 out of the 6, it can skip some data blocks, but possibly not enough to overcome the overhead burden, so this might not be more efficient than a full‑table scan.

    With a large number of partitions at the lowest level, and a large number of eliminated partitions at the lowest level, the system can probably skip enough more data blocks that the overhead burden can be overcome, and the operation might be more efficient than a full‑table scan.

    This second case is somewhat artificial, and probably not a good use of multilevel partitioning. Instead of the demonstrated case, you should consider an alternative partitioning that results in multiple data blocks for each internal partition.

    A partitioning scheme that defines fewer levels of partitioning and fewer partitions per level, where the lowest level has the greatest number of partitions, ensures that there are more rows per combined partition, and would be far more useful. For example, if one level was initially partitioned in intervals of one day, changing the interval to one week or one month might be better.

    Row partition elimination at the lower partition levels of a row partition hierarchy requires more skipping, which can both cause more I/O operations and increase the CPU path length.

    To achieve optimal performance, you should specify a partitioning expression that is more likely to evoke row partition elimination for queries at a higher level and specify those expressions that are not as likely to evoke row partition elimination either at a lower level, or not at all. You should also consider specifying the row partitioning expression with the greatest number of row partitions at the lowest level.

    As previously noted, the order of the row partitioning expressions might not be a significant concern if there are many data blocks per combined partition.

    You can use the following query to find the average number of rows per populated combined row partition.

         SELECT AVG(pc) 
         FROM (SELECT COUNT(*) AS pc
               FROM t 
               GROUP BY PARTITION) AS pt;

    Assuming the average block size is b and the row size is r, you can use the following query to find the average number of data blocks per populated combined row partition. The ideal number of data blocks per populated combined row partition is 10 or more.

         USING (b FLOAT, r FLOAT) 
         SELECT (:r / :b) * AVG(pc) 
         FROM (SELECT COUNT(*) AS pc
               FROM t 
               GROUP BY PARTITION) AS pt;