15.00 - Optimizing Performance by Using Multicolumn NUPIs - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Optimizing Performance by Using Multicolumn NUPIs

You can minimize duplicate row check performance issues by making your NUPIs as close to being unique as possible. The more singular the NUPI value (that is, the closer to being unique it is), the more likely all rows having that NUPI can be stored within a single data block.

A powerful method for achieving the goal of maximal singularity is to define the NUPI on multiple columns. If you decide to use this approach to enhance the uniqueness of a NUPI, keep in mind that you should also define primary indexes on the smallest possible column set (see “Primary Index Value Retrieval Access: Definition” on page 417). The goal is to optimize the tradeoff between enhancing the generality of the index for row retrieval and reducing the number of duplicate row checks that must be performed.

Drawbacks of this method are described in “Advantages and Disadvantages of Multicolumn NUPIs” on page 449.

Consider the following example. Suppose you have a table with 3 name columns:
last_name, first_name, and middle_name. You determine that you must use one or more of the name columns as the NUPI for the table.

1 Start with last_name.

Depending on your demographics, this might be a usable choice. If your demographics indicate that your population has mostly English names, you will probably find numerous Johnsons, Smiths, and Jones among your last name pool.

This would provide a fairly skewed distribution of rows across your AMPs.

2 Now add first_name to last_name.

Given the same population, you are still likely to find multiple names like Robert Smith or Jennifer Johnson, but the singularity of this NUPI is still greatly enhanced over the single column last_name NUPI.

3 Now add middle_name to last_name and first_name.

You might still find duplicate names like Robert David Smith, but the probability is great that there are fewer Robert David Smiths in your population than there are Robert Smiths.

The distribution of rows should be fairly even using this NUPI.