Selecting the optimum primary index for a table or uncompressed join index is often a complex task because applications may favor different types of primary index or perform better using different primary indexes. Tables can have only one primary index, so you must select one that best suits the majority of the applications that a table serves. If the overhead costs justify the expense, you can define multiple join indexes with different primary indexes.
The overhead costs of indexes include the following:
- Disk space is required to store the index subtables.
- System performance degrades whenever base table rows are updated, because the index values for any indexed columns affected by the update must also be updated.
Consider these tradeoffs when planning your indexes, then test to make sure that the assumptions that led to your choices are correct. For example, if you design a primary index with even row distribution as your principal criterion, check that the actual distribution of table rows is even.
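One way to check the distribution is to count rows per AMP using the hash functions. The following is a minimal sketch, not a prescribed procedure; the table name sales_history and the primary index column customer_id are hypothetical and stand in for your own objects.

```sql
-- Hypothetical table and column names; substitute your own.
-- Count how many rows hash to each AMP for the chosen primary index column.
SELECT HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
       COUNT(*) AS row_count
FROM sales_history
GROUP BY 1
ORDER BY 2 DESC;
```

If the largest and smallest counts differ widely, the assumption of even distribution does not hold for the current data. The same query can be run against a candidate column before you commit to it as the primary index.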
For applications that use range queries heavily, a partitioned primary index can resolve these issues better than a nonpartitioned primary index, because it provides efficient access both through the primary index columns and through a constraint on the partitioning columns. Confirm that the partitioning improves query performance by carefully examining EXPLAIN reports and collecting the appropriate statistics.
Collect statistics on the PARTITION column and the partitioning columns.
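As an illustration, the statements below collect these statistics and then request an EXPLAIN report for a range query. This is a sketch under stated assumptions: sales_history is a hypothetical table partitioned on a hypothetical sale_date column, and the date range is arbitrary.

```sql
-- Hypothetical table and column names; substitute your own.
-- Statistics on the system-derived PARTITION column.
COLLECT STATISTICS COLUMN (PARTITION) ON sales_history;

-- Statistics on the partitioning column.
COLLECT STATISTICS COLUMN (sale_date) ON sales_history;

-- Examine the plan for a typical range query to see whether
-- row partition elimination is applied.
EXPLAIN
SELECT customer_id, amount
FROM sales_history
WHERE sale_date BETWEEN DATE '2024-01-01' AND DATE '2024-01-31';
```

In the EXPLAIN text, look for an indication that only some of the partitions are accessed rather than all of them.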
The recommended practice for recollecting statistics is to set appropriate thresholds for recollection using the THRESHOLD options of the COLLECT STATISTICS statement. For details, see COLLECT STATISTICS (Optimizer Form).
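A threshold-based request might look like the following sketch. The table and column names are hypothetical, and the 10 percent and 7 day values are arbitrary placeholders rather than recommendations; check the exact option syntax for your release in the referenced topic.

```sql
-- Hypothetical names and arbitrary threshold values.
-- Skip recollection unless enough of the data has changed
-- and the existing statistics are at least 7 days old.
COLLECT STATISTICS
  USING THRESHOLD 10 PERCENT AND THRESHOLD 7 DAYS
  COLUMN (sale_date)
ON sales_history;
```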
Weigh the costs of the index against its benefits. This is particularly important if you have also defined a USI on the table, because additional maintenance is required to enforce uniqueness, which can reduce or even neutralize the overall performance advantage of the index.
The partitioning might not be used for a query for reasons such as the following:
- The partitioning is not applicable to the actual queries in the workload.
- The Optimizer cost analysis for a query determines that another plan is less expensive.
- The query does not conform to one or more restrictions.
A query plan with partitioning may not perform as well as one without partitioning.
Approaches to defining the partitioning include the following:
- Defining the partitioning expressions such that each row partition has approximately the same number of rows.
This is far easier to achieve for single-level PPIs than for multilevel PPIs, but it remains a goal to approximate as closely as possible.
- Varying the number of rows per partition. For example, more frequently accessed data (such as for the current year) may be divided into finer partitions (such as weeks), but other data (such as previous years) may have coarser partitions (such as months or multiples of months).
Partitioning in this manner can make altering the partitions more difficult.
- Alternatively, defining each range with equal width, even if the number of rows per range then varies, when uniform ranges are more important than uniform row counts.
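The second approach, finer partitions for recent data and coarser partitions for older data, can be sketched with a RANGE_N expression that lists more than one range. The table, columns, and date boundaries below are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table definition; names and dates are illustrative only.
CREATE TABLE sales_history (
    sale_id     INTEGER NOT NULL,
    customer_id INTEGER NOT NULL,
    sale_date   DATE NOT NULL,
    amount      DECIMAL(12,2)
)
PRIMARY INDEX (customer_id)
PARTITION BY RANGE_N (
    sale_date BETWEEN
        -- Older data: coarser, monthly partitions.
        DATE '2022-01-01' AND DATE '2023-12-31' EACH INTERVAL '1' MONTH,
        -- Current year: finer, weekly partitions.
        DATE '2024-01-01' AND DATE '2024-12-31' EACH INTERVAL '7' DAY
);
```

Using a single EACH interval over the whole date span instead would give the equal-width alternative. As the list notes, mixed-width ranges like these can make later changes to the partitioning more difficult.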
The most important factors for row partitioning are accessibility and maximization of row partition elimination. Defining a primary index that distributes the rows of the table evenly across the AMPs (or having no primary index) is critical for efficient parallel processing.