15.00 - Secondary Indexes - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

prodname
Teradata Database
vrm_release
15.00
category
Programming Reference
featnum
B035-1184-015K

Secondary Indexes

You can define up to 32 secondary, hash, and join indexes (in any combination) for 1 table, and secondary indexes can be either unique (USI) or non‑unique (NUSI). A multicolumn NUSI that specifies an ORDER BY clause counts as 2 consecutive indexes in this calculation.

This includes the system‑defined secondary indexes used to implement PRIMARY KEY and UNIQUE constraints.

UNIQUE and PRIMARY KEY constraints that do not define the primary index for a table also count against this total because they are defined internally as USIs for nontemporal tables. These constraints are defined internally as single‑table join indexes, so any UNIQUE or PRIMARY KEY constraint counts equally against the maximum of 32 secondary, hash, or join indexes per table.

You cannot include BLOB, CLOB, BLOB‑based UDT, CLOB‑based UDT, Period, or Geospatial columns in a secondary index column set.

You cannot include the system‑derived PARTITION or PARTITION#Ln columns in any secondary index definition.

You can include row-level security columns in a secondary index definition.

You can add secondary indexes to an existing table using the CREATE INDEX statement (see “CREATE INDEX” on page 334).

When a non‑primary index uniqueness constraint is created for a table, Teradata Database implements it internally as a USI.

As a general guideline for decision support applications, whenever you define the primary index for a multiset table to be a NUPI, you should consider defining one of the following uniqueness constraints on its primary key or other alternate key to facilitate row access and joins.

  • Unique secondary index
  • UNIQUE NOT NULL constraint
  • PRIMARY KEY constraint
  • You should always consider adding such constraints to your tables when they facilitate row access or joins. All manner of database constraints are often useful for query optimization, and the richer the constraint set specified for a database, the more opportunities exist to enhance query optimization.

    The likely benefits of adding uniqueness constraints are not restricted to multiset NUPI tables. It is also true that if you create a SET NUPI table, the system performs duplicate row checks by default unless you place a uniqueness constraint on the table. Unique constraint enforcement is often less a less costly method of enforcing row uniqueness than system duplicate row checks.

    Avoid defining a uniqueness constraint on the primary or alternate key of a multiset NUPI table just to enforce row uniqueness because MultiLoad and FastLoad do not support target tables with uniqueness constraints except those on the primary index. You can avoid the MultiLoad and FastLoad problems associated with indexes and triggers by using one of them (generally FastLoad) to load rows into a staging table that is otherwise identical with the target table, but has no constraints, triggers, or indexes defined on it. After loading the data into that table, you can then use either INSERT … SELECT or MERGE with error logging to move it into the target table that has the desired constraints and indexes defined on it. See “CREATE ERROR TABLE” on page 228 and SQL Data Manipulation Language for details.

    Furthermore, USIs impose a performance cost because their index subtables must be maintained by the system each time the column set in the base table they reference is updated.

    ALWAYS GENERATED … NO CYCLE identity columns can be a better choice for the task of enforcing row uniqueness in multiset NUPI tables because they are supported by both MultiLoad and FastLoad. However, identity columns cannot be used to facilitate row access or joins.

    You can also decide not to allow the system to enforce row uniqueness. Instead, you can use either of the following approaches.

  • Enforce row uniqueness with application code.
  • Avoid excluding duplicate rows by any arbitrary method, then perform periodic validation queries against your tables to detect duplicate rows.
  • This method assumes 2 things.

  • Your application code tolerates duplicate rows.
  • You know how to identify duplicate rows and how to eliminate any that are found in your tables.
  • There are several reasons to advise against approaches to duplicate row management that avoid using declarative constraints in order not to guarantee database integrity.

    For example, if you define any non‑primary index uniqueness constraints on a table, you must drop them all before you use MultiLoad or FastLoad to load rows into that table, then recreate them after the load operation completes.

    If the MultiLoad or FastLoad operation loads any duplicate rows into the table, then you cannot recreate a uniqueness constraint on the table. You must first detect the duplicates and then remove them from the table.

    Application‑based methods of duplicate row management make several assumptions that are difficult to enforce. These assumptions include.

  • The application code used to detect and reject duplicate rows is implemented identically across all applications in the enterprise.
  • The application code used to detect and reject duplicate rows, even if universally implemented, is correct for all situations.
  • All updates to the database are made by means of those applications. This assumption neglects the possibility of ad hoc interactive SQL updates that bypass the application‑based duplicate row detection and rejection code.
  • USIs are generally not recommended for tables in workloads that support OLTP applications.

    Each secondary index has an associated secondary index subtable. Each row in a secondary index subtable contains the indexed column value and 1 or more row IDs that point to the base table data rows containing that value (see Database Design for details of USI and NUPI subtable row layouts).

     

    Rows for this type of secondary index …

    Are stored in a subtable on …

    USI

    a different AMP from the rows they reference.

    This is generally, but not universally, true. It is possible, though unlikely, for a USI row to be stored on the same AMP as the base table row it references. For performance planning, it is best to think of secondary index access as always being a 2‑AMP operation.

    An exception to this is the case where a USI is defined on the same column set as a partitioned primary index, but not all of the partitioning columns are in the primary index column set.

    NUSI

    the same AMP as the rows they reference.

    Secondary indexes can greatly reduce table search time for so‑called set selections (this term, which refers to selecting multiple rows, is technically a misnomer because a set can also contain 0 or 1 elements). However, update operations (including delete, insert, and update) on tables with secondary indexes require more I/O to process because of the required transient journaling, the possibility that secondary index subtables must be updated (if the indexed column set was updated), fallback table updates, and so on.

    USI subtables are hashed and are very efficient for selecting a single row or a small number of rows.

    NUSI bit mapping provides efficient retrieval when complex conditional expressions are applied to very large tables. Several weakly selective NUSIs can be overlapped and bit mapped to achieve greater selectivity.

    The ALL option for NUSIs ignores the assigned case specificity for a column. This property enables a NUSI to do the following.

  • Include case specific values.
  • Cover a table on a NOT CASESPECIFIC column set.
  • Be aware that specifying the ALL option might also require additional index storage space.

    You cannot specify multiple NUSIs that differ only by the presence or absence of the ALL option.

    You should always test the performance effects of a secondary index before implementing it.

    See Database Design for additional details about secondary indexes.