Criteria for Selecting a Primary Index

Criteria for Selecting a Primary Index - Teradata Vantage

Teradata® VantageCloud Lake

Deployment

VantageCloud

Edition

Lake

Product

Teradata Vantage

Published

January 2023

Language

English (United States)

Last Update

2024-04-03

dita:mapPath

phg1621910019905.ditamap

dita:ditavalPath

pny1626732985837.ditaval

dita:id

phg1621910019905

The following guidelines and performance considerations apply to selecting a unique or a nonunique column set as the primary index for a table.

Choose columns for the primary index based on the selection set most frequently used to retrieve rows from the table even when that set is not unique (if and only if the values of the selection set are equally distributed across the AMPs).
Choose columns for the primary index that do not have XML, BLOB, CLOB, BLOB-based UDT, CLOB-based UDT, XML-based UDT, Period, JSON, ARRAY, VARRAY, or VARIANT_TYPE data types.
Distinct and structured UDT columns are valid components of a primary index, but UDT columns based on internal Teradata UDT types, such as the Period, Geospatial, ARRAY, and VARRAY types, are not.
Structured/Distinct/Variant UDTs are only supported on the Block File System on the primary cluster. They are not available for the Object File System.
Choose columns for the primary index that distribute table rows evenly across the AMPs. The more singular the values for a column, the more optimal their distribution.
Choose the minimum number of columns for the primary index, to optimize its generality.
All the columns in a composite primary index must be specified in a WHERE clause predicate before the Optimizer can select that index for use as the retrieval mechanism.
If defining a unique primary index for a table that must have one is difficult, generate arbitrary unique values for a single column that is defined as an identity column with the characteristics ALWAYS GENERATED and NO CYCLE.

Base the column selection on an equality search (if the primary index is a PPI, then the search is done within each non-eliminated populated partition). For equality constraints only, the system hashes directly to the row set that satisfies the condition.

Primary Key	Primary Index Assignee
Single-column	Primary key. This is called a Unique Primary Index (UPI).
Multicolumn	A foreign key component of the primary key. This is called a Nonunique Primary Index (NUPI).

Primary and other alternate key column sets can provide useful uniqueness constraints and a powerful access and join method when the logical design for a table is physically realized. If the primary or other alternate keys for a table are not its primary index, consider assigning a unique constraint, such as PRIMARY INDEX, UNIQUE, or a USI on those keys if the uniqueness constraint facilitate table access and joins.
This recommendation is contingent on factors to consider before implementing unique constraints. See Using Unique Secondary Indexes to Enforce Row Uniqueness for a list of the factors to consider when implementing this recommendation.

UPI NUPI

Involves at most one row Can involve multiple rows

Does not require a spool Often creates a spool
Duplicate NUPI values are stored on the same AMP and in the same data block if possible.

UPI	NUPI
Involves at most one row	Can involve multiple rows
Does not require a spool	Often creates a spool

NUPI retrieval only requires one I/O operation (or two I/Os if the cylinder index is not memory-resident) when the rows are stored in the same data block.

Value Range	Predicate in WHERE Clause	Retrieval Action
Implicit	BETWEEN	Full table scan, irrespective of any indexes defined for the table. The exceptions are the following: PPI tables and join indexes, where row partition elimination can be exploited. Join index tables with a value-ordered NUPI, where value ordering can be exploited.
Explicit	IN	Individual row hashing.