Summary of Primary Index Selection Criteria | Teradata Vantage - Summary of Primary Index Selection Criteria - Teradata Vantage - Analytics Database

Database Design

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ogg1628096130566.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
zqc1472244571611
lifecycle
lifecycle
Product Category
Teradata Vantage™

The following table summarizes the guidelines for selecting columns to be used as primary indexes.

Guideline Comments
Select columns that are most frequently used to access rows. Restrict selection to columns that are either unique or highly singular.
Select columns that are most frequently used in equality predicate conditions. Equality conditions permit the system to hash directly to the row having the conditional value. When the primary index is unique, the response is never more than one row.

Inequality conditions require additional processing.

Select columns that distribute rows evenly across the AMPs. Distinct values distribute evenly across all AMPs in the configuration. This maximizes parallel processing.

Rows having duplicate NUPI values hash to the same AMP and often are stored in the same data block. This is good when rows are only moderately nonunique.

Rows having NUPI columns that are highly nonunique distribute unevenly, use multiple data blocks, and incur multiple I/Os.

Extremely nonunique primary index values can skew space usage so markedly that the system returns a message indicating that the database is full even when it is not.

This occurs when an AMP exceeds the maximum bytes threshold for a user or database calculated by dividing the PERMANENT = n BYTES specification by the number of AMPs in the configuration, causing the system to incorrectly perceive the database to be “full.”

Select columns that are not volatile. Volatile columns force frequent row redistribution.
Select columns having very many more distinct values than the number of AMPs in the configuration. If this guideline is not followed, row distribution skews heavily, not only wasting disk space, but also devastating system performance.

This rule is particularly important for large tables.

Do not select columns defined with Period, ARRAY, VARRAY, Geospatial, JSON, XML, BLOB, CLOB, XML-based UDT, BLOB-based UDT, or CLOB-based UDT data types. You cannot specify columns that have BLOB, CLOB, BLOB-based UDT, CLOB-based UDT, XML-based UDT, Period, ARRAY, VARRAY, Geospatial, or JSON data types in a primary index definition. If you attempt to do so, the CREATE request aborts.

You can, however, specify Period data type columns in the partitioning expression of a partitioned table.

Do not select aggregated columns of a join index. When defining the primary index for a join index, you cannot specify any aggregated columns.

If you attempt to do so, the CREATE JOIN INDEX request aborts.