15.00 - Partitioning Guidelines - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Partitioning Guidelines

The following guidelines provide a high‑level set of criteria for making an initial evaluation of whether row partitioning would provide more benefits to a query workload than a nonpartitioned table.

  • Large tables and join indexes are usually better candidates for row partitioning than smaller tables and join indexes because there is not much benefit to partitioning a table or join index small enough that a full‑table scan on the nonpartitioned table or join index takes only a few seconds.
  • The exception to this is a small table that is row‑partitioned identically to a larger table with which it is frequently joined and with which it shares its primary index.

  • When possible, you should row‑partition on sets of columns that are frequently used as query conditions. For example, if half the queries against a table specify a date range that qualifies less than 25% of the rows, then that date column is a good candidate to be the partitioning column for the table.
  • If there is no column that is frequently used as a query condition, then there is probably little or no advantage to row‑partitioning the table.

  • All factors being equal, it is better to partition on a column set that is part of the primary index column set than to partition on a column that is not.
  • The exception to this is if the primary index is rarely, if ever, used for row access or join operations.

  • Keep the number of row partitions relatively small. The key word in this guideline is relatively. The guideline also applies for multilevel partitioning situations, though it is more difficult to achieve for multilevel partitioning because the total number of partitions is a multiplicative factor of the number of partitioning levels defined for the table, so the number of partitions can grow very quickly even when there are few partitions defined for each level (see “Usage Considerations and Rules for PARTITION and PARTITION#Ln Columns” on page 808).
  • The exception to this guideline is if the primary index is rarely, if ever, used for row access or direct merge joins.

    For example, if all the queries against the table access at least one month of activity, there is little or no benefit to partitioning by week or day instead of by month. An exception to this is if bulk data loading times are greatly reduced by a finer partition granularity.

    See “Scenario 4” on page 434 for an example of evaluating these sorts of tradeoffs.

  • Keep the number of partitions small even if you plan to expand predetermined table operations in the future. You can always increase the number of partitions later when they are needed.
  • Note: If you collect and maintain fresh statistics on the PARTITION columns of tables, this consideration is much less important.

    You have greater flexibility with this guideline for single‑level partitioned tables than you do for multilevel partitioned tables because it can be rather complicated to decrease the number of partitions for a multilevel partitioning because the number of combined partitions defined for such a table increases multiplicatively with each partition and with each level defined.

  • The same criteria for selecting the column set for a nonpartitioned table also apply to partitioned tables (see “Principal Criteria for Selecting a Primary Index” on page 407 and “Secondary Considerations for Selecting a Primary Index” on page 416).
  • Choose a primary index column set that provides good row distribution, avoids skew, and is commonly used to access individual rows or do not use a primary index.

    Optimal row distribution and frequent access are sometimes conflicting considerations, so you must evaluate their relative merits and come to some compromise if that is the case.