Summary of Primary Index Selection Criteria | VantageCloud Lake - Summary of Primary Index Selection Criteria - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

The following table summarizes the guidelines for selecting columns to be used as primary indexes.

Guideline Comments
Select columns that are most frequently used to access rows. Restrict selection to columns that are either unique or highly singular.
Select columns that are most frequently used in equality predicate conditions. Equality conditions permit the system to hash directly to the row having the conditional value. When the primary index is unique, the response is never more than one row.

Inequality conditions require additional processing.

Select columns that distribute rows evenly across the AMPs. Distinct values distribute evenly across all AMPs in the configuration. This maximizes parallel processing.

Rows having duplicate NUPI values hash to the same AMP and often are stored in the same data block. This is good when rows are only moderately nonunique.

Rows having NUPI columns that are highly nonunique distribute unevenly, use multiple data blocks, and incur multiple I/Os.

Extremely nonunique primary index values can skew space usage so that the system returns an incorrect message indicating that the database is full.

This occurs when an AMP exceeds the maximum bytes threshold for a user or database calculated by dividing the PERMANENT = n BYTES specification by the number of AMPs in the configuration, causing the system to incorrectly perceive the database to be “full.”

Select columns that are not volatile. Volatile columns force frequent row redistribution.
Select columns having significantly more distinct values than the number of AMPs in the configuration. If this guideline is not followed, row distribution skews heavily, not only wasting disk space, but also devastating system performance.

This rule is particularly important for large tables.

Do not select columns defined with Period, Geospatial, JSON, XML, BLOB, CLOB, data types. You cannot specify columns that have BLOB, CLOB, Period, Geospatial, or JSON data types in a primary index definition. If you attempt to do so, the CREATE request aborts.

You can, however, specify Period data type columns in the partitioning expression of a partitioned table.

Do not select aggregated columns of a join index. When defining the primary index for a join index, you cannot specify any aggregated columns.

If you attempt to do so, the CREATE JOIN INDEX request aborts.