The primary index for a table controls the distribution and retrieval of the data for that table across the AMPs. Both distribution and retrieval of the data is controlled using the Teradata Database hashing algorithm (see “Row Hash and RowID” on page 28).
If the primary index is defined as a partitioned primary index (PPI), the data is partitioned, based on some set of columns, on each AMP, and ordered by the hash of the primary index columns within the partition.
Data accessed based on a primary index is always a one-AMP operation because a row and its index are stored on the same AMP. This is true whether the primary index is unique or nonunique, and whether it is partitioned or nonpartitioned.
In general, most Teradata Database tables require a primary index. (Some tables in the Data Dictionary do not have primary indexes and a global temporary trace table is not allowed to have a primary index.) To create a primary index, use the CREATE TABLE statement.
If you do not assign a primary index explicitly when you create a table, Teradata Database assigns a primary index, based on the following rules.
Primary Index |
Primary |
Unique Column Constraint |
Teradata Database selects the … |
no |
YES |
no |
primary key column set to be a UPI. |
no |
no |
YES |
first column or columns having a UNIQUE constraint to be a UPI. |
no |
YES |
YES |
primary key column set to be a UPI. |
no |
no |
no |
first column defined for the table to be a NUPI. If the data type of the first column in the table is a LOB, then the CREATE TABLE operation aborts and the system returns an error message. |
In general, the best practice is to specify a primary index instead of having Teradata Database select a default primary index.
When choosing the primary index for a table, there are two essential factors to keep in mind: uniform distribution of the data and optimal access.
With respect to uniform data distribution, consider the following factors:
With respect to optimal data access, consider the following factors:
For example:
Although it is true that the columns you choose to be the primary index for a table are often the same columns that define the primary key, it is also true that primary indexes often comprise fields that are neither unique nor components of the primary key for the table.
In addition to uniform distribution of data and optimal access considerations, other guidelines and performance considerations apply to selecting a unique or a nonunique column set as the primary index for a table.
Other considerations can include:
For more information on criteria for selecting a primary index, see Database Design.
The decision to define a single-level or multilevel Partitioned Primary Index (PPI) for a table depends on how its rows are most frequently accessed. PPIs are designed to optimize range queries while also providing efficient primary index join strategies and may be appropriate for other classes of queries. Performance of such queries is improved by accessing only the rows of the qualified partitions.
A PPI increases query efficiency by avoiding full table scans without the overhead and maintenance costs of secondary indexes.
The most important factors for PPIs are accessibility and maximization of partition elimination. In all cases, it is critical for parallel efficiency to define a primary index that distributes the rows of the table fairly evenly across the AMPs.
For more information on partitioning considerations, including information on column partitioning, called Teradata Columnar, see Database Design, SQL Data Definition Language Detailed Topics, and SQL Data Manipulation Language.
Restrictions apply to the columns you choose to be the primary index for a table. For partitioned primary indexes, further restrictions apply to the columns you choose to be partitioning columns. Here are some of the restrictions:
Other restrictions apply to the partitioning expression for PPIs. For example:
For details on all the restrictions that apply to primary indexes and partitioned primary indexes, see CREATE TABLE in SQL Data Definition Language.
CREATE INDEX is used only to create secondary indexes.
Some modifications, such as partitioning and primary index columns, require an empty table.
Partitioned primary indexes are not automatically assigned. You must explicitly define a partitioned primary index.
Note that a partitioned primary index can only be unique if all the partitioning columns are also included as primary index columns. If the primary index does not include all the partitioning columns, uniqueness on the primary index columns may be enforced with a unique secondary index on the same columns as the primary index.
For more information on using primary indexes to enhance the performance of your databases, see Database Design.