15.00 - Primary Index Types - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Primary Index Types

Except for nonpartitioned NoPI tables (see “NoPI Tables, Column‑Partitioned Tables, and Column-Partitioned Join Indexes” on page 280), column‑partitioned tables and join indexes, and global temporary trace tables, each Teradata Database table requires a primary index to distribute table rows to the AMPs based on their primary index values.

Nonpartitioned NoPI table rows and column‑partitioned table and join index rows are distributed using a different mechanism (see “Row Allocation for Teradata Parallel Data Pump” on page 237 for both nonpartitioned NoPI and column‑partitioned tables and “Row Allocation for FastLoad Operations Into Nonpartitioned NoPI Tables” on page 238 for nonpartitioned NoPI table rows only.

Teradata Database provides the following primary index types:

  • Unique primary index (UPI)
  • Nonunique primary index (NUPI)
  • The NUPI for a join index can be locally value‑ordered in some cases. See the description of the CREATE JOIN INDEX statement in SQL Data Definition Language for details.

  • Nonpartitioned primary index
  • Row‑partitioned primary index (PPI)
  • Partitioned primary indexes fall into two general categories, depending on whether they have a single partitioning expression or multiple partitioning expressions:

  • Single‑level
  • Multilevel
  • These can be mixed in any cross‑dimensional combination, so the following complete specifications for a primary index can all be defined in Teradata Database:

  • Unique nonpartitioned primary index
  • Nonunique nonpartitioned primary index
  • Unique single‑level partitioned primary index
  • Unique multilevel partitioned primary index
  • Nonunique single‑level partitioned primary index
  • Nonunique multilevel partitioned primary index
  • Primary Index Dimensions

    Primary indexes are defined over two orthogonal dimensions: unique versus nonunique and partitioned versus nonpartitioned.

    Unique Versus Nonunique Dimension

    This choice raises two issues.

  • The possibility of not distributing the rows for a table evenly across the AMPs
  • The necessity of checking for duplicate rows in a SET table with a NUPI (unless there is also a USI defined on the table).
  • The most likely scenario causing an uneven row distribution would be if you did not select a nearly unique column to be a nonunique primary index, in which case the distribution of rows for the table across the AMPs or across hash values can be uneven (sometimes the words “lumpy” or “skewed” are used to describe this kind of distribution).

    Do not interpret this to mean that NUPIs necessarily, or even usually, cause the distribution of rows across the AMPs to be skewed. Particularly when the number of rows in a table is very large, it is possible to achieve an even distribution of rows with a NUPI, even if it is not nearly unique.

    Skewed distributions also occur for unique primary indexes whenever the number of rows in a table is small relative to the number of AMPs in the configuration, particularly when the following situations occur.

  • There are fewer rows than AMPs.
  • There are very few unique values (not more than the number of AMPs), even when there are many rows.
  • Excessively uneven distribution of rows causes performance problems, particularly for large tables, and should be avoided.

    Depending on table size and other factors, duplicate row checks on NUPI SET tables can have a significant negative impact on performance.

    Partitioned Versus Nonpartitioned Dimension

    Row‑partitioned primary indexes are defined for tables that are frequent targets of range or row‑partition‑based queries. A common example of this is queries that involve date ranges.

    As for a primary index, partitioning is specified either with the CREATE TABLE statement when the table is created or with the ALTER TABLE statement after the table has already been created. The rows of a table having a PPI are hashed to the same AMPs they would be hashed to if the primary index were not partitioned. Once assigned to an AMP, PPI rows are further assigned to, and stored in rowhash order within, their respective row partitions.

    Nonpartitioned primary indexes behave identically with the exception that they are assigned to one single row partition (number 0) after they are hashed to their AMP.

    Nonpartitioned primary index rows are stored in rowhash order on their respective AMPs.

    See Chapter 9: “Primary Indexes and NoPI Objects,” for additional details about partitioned and nonpartitioned primary indexes.