Criteria for Selecting a Primary Index - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905
The following guidelines and performance considerations apply to selecting a unique or a nonunique column set as the primary index for a table.
  • Choose columns for the primary index based on the selection set most frequently used to retrieve rows from the table even when that set is not unique (if and only if the values of the selection set are equally distributed across the AMPs).
  • Choose columns for the primary index that do not have XML, BLOB, CLOB, BLOB-based UDT, CLOB-based UDT, XML-based UDT, Period, JSON, ARRAY, VARRAY, or VARIANT_TYPE data types.
    Distinct and structured UDT columns are valid components of a primary index, but UDT columns based on internal Teradata UDT types, such as the Period, Geospatial, ARRAY, and VARRAY types, are not.
    Structured/Distinct/Variant UDTs are only supported on the Block File System on the primary cluster. They are not available for the Object File System.
  • Choose columns for the primary index that distribute table rows evenly across the AMPs. The more singular the values for a column, the more optimal their distribution.
  • Choose the minimum number of columns for the primary index, to optimize its generality.

    All the columns in a composite primary index must be specified in a WHERE clause predicate before the Optimizer can select that index for use as the retrieval mechanism.

  • If defining a unique primary index for a table that must have one is difficult, generate arbitrary unique values for a single column that is defined as an identity column with the characteristics ALWAYS GENERATED and NO CYCLE.
  • Base the column selection on an equality search (if the primary index is a PPI, then the search is done within each non-eliminated populated partition). For equality constraints only, the system hashes directly to the row set that satisfies the condition.
    Primary Key Primary Index Assignee
    Single-column Primary key.

    This is called a Unique Primary Index (UPI).

    Multicolumn A foreign key component of the primary key.

    This is called a Nonunique Primary Index (NUPI).

  • Primary and other alternate key column sets can provide useful uniqueness constraints and a powerful access and join method when the logical design for a table is physically realized. If the primary or other alternate keys for a table are not its primary index, consider assigning a unique constraint, such as PRIMARY INDEX, UNIQUE, or a USI on those keys if the uniqueness constraint facilitate table access and joins.

    This recommendation is contingent on factors to consider before implementing unique constraints. See Using Unique Secondary Indexes to Enforce Row Uniqueness for a list of the factors to consider when implementing this recommendation.

    UPI NUPI
    Involves at most one row Can involve multiple rows
    Does not require a spool Often creates a spool
  • Duplicate NUPI values are stored on the same AMP and in the same data block if possible.
  • NUPI retrieval only requires one I/O operation (or two I/Os if the cylinder index is not memory-resident) when the rows are stored in the same data block.
    Value Range Predicate in WHERE Clause Retrieval Action
    Implicit BETWEEN Full table scan, irrespective of any indexes defined for the table.
    The exceptions are the following:
    • PPI tables and join indexes, where row partition elimination can be exploited.
    • Join index tables with a value-ordered NUPI, where value ordering can be exploited.
    Explicit IN Individual row hashing.