16.10 - Distribution Demographics - Teradata Database

Teradata Database Design

Product: Teradata Database
Release Number: 16.10
Release Date: June 2017
Content Type: User Guide
Publication ID: B035-1094-161K
Language: English (United States)

The primary guideline for selecting a primary index column set is to achieve an even distribution of rows across the AMPs.

The closer to unique the values of the column set chosen as the primary index, the more evenly table rows are distributed across the AMPs of a system. A unique index is ideal for ensuring even distribution. When distribution is optimized, so is parallel processing, because each AMP has roughly the same number of rows to work on.
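The relationship between uniqueness and evenness can be illustrated with a small simulation. This is only a sketch: the number of AMPs, the sample columns, and the use of MD5 as a stand-in hash are all assumptions made for illustration, and the code does not reproduce the actual Teradata row hash algorithm.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # hypothetical system size, chosen for illustration


def amp_for(value, num_amps=NUM_AMPS):
    """Map a value to an AMP number using MD5 as a stand-in hash.

    Assumption: this is NOT the Teradata row hash; any well-mixed hash
    shows the same qualitative effect.
    """
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(values, num_amps=NUM_AMPS):
    """Count how many rows land on each AMP for the given index values."""
    counts = Counter(amp_for(v, num_amps) for v in values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# A unique column: every row has a different value.
unique_ids = range(100_000)
# A low-cardinality column: only three distinct values.
status_codes = [i % 3 for i in range(100_000)]

unique_dist = rows_per_amp(unique_ids)
status_dist = rows_per_amp(status_codes)

print("unique column :", unique_dist)
print("3-value column:", status_dist)
```

In this sketch the unique column spreads rows over every AMP in similar numbers, while the three-value column can occupy at most three AMPs and leaves the rest empty.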

Although the ideal primary index for a table optimizes both retrieval and distribution, in practice you often have to trade one off against the other. It is not unusual for a particular primary index to provide the best access but poor distribution, or the reverse.

Distribution Guidelines

The key to resolving this trade-off is to prototype several candidate primary indexes and see which is the best choice. You might have to settle for the solution that best balances access and distribution rather than one that maximizes either.
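One way to picture the prototyping step is to score each candidate column set by how unevenly it spreads rows. The sketch below is a simulation under stated assumptions: the table, its column names (order_id, customer_id, region), the 16-AMP system, the MD5 stand-in hash, and the skew formula (100 * (1 - average rows per AMP / maximum rows per AMP)) are all hypothetical choices for illustration. It measures distribution only, so the access side of the trade-off still has to be judged from the workload.

```python
import hashlib
import random
from collections import Counter

NUM_AMPS = 16  # hypothetical system size


def amp_for(key, num_amps):
    """Stand-in hash (MD5), not the Teradata row hash."""
    digest = hashlib.md5(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def skew_percent(rows, columns, num_amps=NUM_AMPS):
    """Return 0 for perfectly even distribution, approaching 100 as it worsens."""
    counts = Counter(
        amp_for(tuple(row[c] for c in columns), num_amps) for row in rows
    )
    per_amp = [counts.get(amp, 0) for amp in range(num_amps)]
    average = sum(per_amp) / num_amps
    return 100.0 * (1.0 - average / max(per_amp))


# Synthetic sample table (hypothetical columns and value ranges).
rng = random.Random(42)
rows = [
    {
        "order_id": i,
        "customer_id": rng.randrange(5_000),
        "region": rng.choice(["N", "S", "E", "W"]),
    }
    for i in range(50_000)
]

# Candidate primary index column sets to prototype.
candidates = [
    ("order_id",),
    ("customer_id",),
    ("region",),
    ("customer_id", "region"),
]

results = {cols: skew_percent(rows, cols) for cols in candidates}

# List candidates from most even to least even distribution.
for cols, skew in sorted(results.items(), key=lambda item: item[1]):
    print(f"{', '.join(cols):<24} skew {skew:5.1f}%")
```

A candidate with low skew here is not automatically the best choice; if queries rarely supply its values, it distributes well but offers little help for access.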

The goal of this guideline is to optimize row distribution without worsening row accessibility.

With respect to partitioned primary indexes, the granularity of partitioning can be a significant factor in the general effectiveness of the index. The more row partitions there are, the more precisely the partitions a query does not need can be eliminated. At the same time, too fine a granularity can have a negative effect on primary index access as well as on joins and aggregations on the primary index. See "Row-Partitioned and Nonpartitioned Primary Index Access for Typical Operations" for a detailed comparison of the various access mechanisms for the two primary index types over a range of typical operations and conditions.
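The granularity trade-off can be made concrete with a small calculation. The following sketch assumes a hypothetical table holding one year of data partitioned by date, and a hypothetical query that needs one week of it; it counts how many partitions exist, how many the query touches, and how many days of data those partitions contain at each granularity. It is an arithmetic illustration only and does not model how Teradata Database actually stores or scans partitions.

```python
from datetime import date, timedelta


def partition_key(day, granularity):
    """Return the partition a given day falls into at the chosen granularity."""
    if granularity == "year":
        return (day.year,)
    if granularity == "month":
        return (day.year, day.month)
    if granularity == "day":
        return (day.year, day.month, day.day)
    raise ValueError(f"unknown granularity: {granularity}")


def days_between(start, end):
    """All dates from start to end, inclusive."""
    return [start + timedelta(days=n) for n in range((end - start).days + 1)]


# Hypothetical table contents and query range.
table_days = days_between(date(2016, 1, 1), date(2016, 12, 31))
query_days = days_between(date(2016, 3, 10), date(2016, 3, 16))


def summarize(granularity):
    all_parts = {partition_key(d, granularity) for d in table_days}
    touched = {partition_key(d, granularity) for d in query_days}
    # Days of data held in the partitions that cannot be eliminated.
    days_read = sum(
        1 for d in table_days if partition_key(d, granularity) in touched
    )
    return {
        "partitions": len(all_parts),
        "touched": len(touched),
        "days_read": days_read,
    }


summary = {g: summarize(g) for g in ("year", "month", "day")}

for granularity, stats in summary.items():
    print(granularity, stats)
```

Finer partitions let the one-week query read less unneeded data, but the total number of partitions grows accordingly. That larger partition count is where the cost for primary index access, joins, and aggregations described above can arise, for example when a request does not supply a value for the partitioning column.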