Considerations for Choosing a Primary Index
Selecting the optimum primary index for a table or uncompressed join index is often a complex task because some applications might favor one type of primary index, while other applications might perform more optimally using a different primary index. Tables can have only one primary index, however, so you must select one that best suits the majority of the applications that a table serves. Of course, if the overhead costs justify the expense, you can define multiple join indexes with different primary indexes.
You can always add additional indexes, such as secondary, hash, and join indexes, to facilitate particular applications. Be aware that these indexes all incur various overhead costs, including:
You should always consider these tradeoffs when planning your indexes, then be sure to test them to ensure that the assumptions that lead to your choices are correct. For example, if you design a primary index with even row distribution as your principal criterion, analyze the actual distribution of table rows to ensure that they are evenly distributed.
For many applications, particularly those that use range queries heavily, a partitioned primary index can provide a better solution to resolving these issues than a nonpartitioned primary index because it provides efficient access both via the primary index columns as well as via a constraint on the partitioning columns. As always, you should confirm that the partitioning actually improves query performance by carefully examining EXPLAIN reports and collecting the appropriate statistics.
You should always collect statistics on the PARTITION column and the partitioning columns.
The recommended practice for recollecting statistics is to set appropriate thresholds for recollection using the THRESHOLD options of the COLLECT STATISTICS statement. See “COLLECT STATISTICS in SQL Data Definition Language for details on how to do this.
You should also weigh the costs of the index against the benefits it provides. This is particularly important if you have also defined a USI on the table because additional maintenance is required to enforce uniqueness, thus potentially neutralizing or even reducing the overall performance advantage of the index.
Creating a partitioned table does not guarantee that row-partition elimination plan. A partitioning might not be used for any of the following common reasons:
In some cases, a query plan with partitioning might not perform as well as one without partitioning (see “Row‑Partitioned and Nonpartitioned Primary Index Access for Typical Operations” on page 351 for specific examples).
Various partitioning strategies can be followed.
This task is far easier for single‑level PPIs than for multilevel PPIs, though it can still be thought of as a goal to be approximated as best as possible.
Note that partitioning in this manner can make altering the partitions more difficult.
The most important factors for row partitioning are accessibility and maximization of row partition elimination. In all cases, defining a primary index (or having no primary index) that distributes the rows of the table fairly evenly across the AMPs is critical for efficient parallel processing. See “Evaluating the Relative Merits of Partitioning Versus Not Partitioning” on page 361 for further information.