When assigning columns to be a secondary index for a table, there are numerous factors to consider, the most important of all being the selectivity of the index.
While USI retrievals are always very efficient, the efficiency of NUSI retrievals varies greatly depending on their selectivity.
Optimal Data Access
Selectivity is a relative term that refers to the number of rows returned by an index. Most retrievals aim to return only a select few rows: very specific answers in response to a very specific request.
An index that returns a small number of rows is said to be highly selective. This is a positive attribute.
Indexes that return a large number of rows are said to have low selectivity. This is generally a negative attribute; so negative that, as often as not, the Optimizer selects a full-table scan over a NUSI with low selectivity because the full-table scan can be less costly.
All UPIs and USIs are highly selective by definition, as are most well-chosen NUPIs. High selectivity is favored not only because of its precision, but also because of its low cost, involving a very small number of disk I/Os, which is always a performance-enhancing attribute.
Criteria for Selecting a Secondary Index
- Consider naming secondary indexes whenever possible using a standard naming convention.
- Avoid assigning secondary indexes to frequently updated column sets.
- Avoid assigning secondary indexes to columns with lumpy distributions because there is a slight chance the Optimizer might mistake their usefulness.
- Avoid creating excessive secondary indexes on a table, particularly for a table used heavily, or even moderately, for OLTP processing. The less frequently the table is updated, the more desirable a multiple index solution.
- Consider building secondary indexes on column sets frequently involved in the following clauses, predicates, and other logical operations:
- Selection criteria
- Join criteria
- ORDER BY clauses
- GROUP BY clauses
- Foreign keys (because of join and subquery processing)
- UNION, DISTINCT, and other sort operations
When these operations act on well-indexed column sets, the number of scans and sorts that must be performed on the data by the database manager can be greatly reduced.
- Consider creating USIs for tables without a UPI that require frequent single-row access.
- Consider creating NUSIs for tables that require frequent set selection access.
- Consider creating covering indexes when possible and cost effective (including considering the cost of maintaining the index). The Optimizer frequently selects covering indexes to substitute for a base table access whenever the overall cost of the query plan is reduced. Such index-only access promotes faster retrievals.
Alternatively, many applications are well served by join indexes, which can be used profitably in many covering situations where multiple columns are frequently joined. See Join and Hash Indexes for further information about join indexes.
- Consider creating secondary indexes on columns frequently operated on by built-in functions such as aggregates.
- Consider assigning a uniqueness constraint such as PRIMARY KEY, UNIQUE, or USI, as appropriate, to the primary or other alternate key of any table built with a NUPI. This both enforces uniqueness, eliminating the burden of making row uniqueness checks, and enhances retrieval for applications where the primary or other alternate key is frequently used as a selection or join criterion.
This guideline is situational and is contingent on a number of factors. The various factors involved in the recommendation are described in Using Unique Secondary Indexes to Enforce Row Uniqueness.
A primary or alternate key USI might not be a good decision for a table that is frequently updated by OLTP applications.
- Plan to dynamically drop and recreate secondary indexes to accommodate specific processing and performance requirements such as bulk data loading utilities, database archives, and so on.
Create appropriate macros to perform these drop and create index operations if you need to undertake such specific processing tasks regularly.
- Make sure that your indexes are being used as planned by submitting EXPLAIN request modifiers to audit index selection for those queries they are designed to facilitate.Secondary indexes that are never selected by the Optimizer are a burden to the system for the following reasons:
- They consume disk resources that could profitably be used to store data or indexes that are used.
- They degrade update processing performance unnecessarily.
Secondary Index Usage Summary
- Can enhance the speed of data retrieval.
- Do not affect base table data distribution.
- Maximum of 32 secondary, hash, and join indexes defined per table. Each composite NUSI that specifies an ORDER BY clause counts as 2 consecutive indexes in this calculation (see Importance of Consecutive Indexes for Value-Ordered NUSIs).
The limit of 32 indexes applies to any combination of secondary, hash, and join indexes defined on a table, ranging from 0 secondary indexes and 32 join indexes, 11 hash indexes, 11 join indexes, and 10 secondary indexes to 32 secondary indexes and 0 join indexes.
This includes the system-defined secondary indexes used to implement PRIMARY KEY and UNIQUE constraints.
- Can be composed of as many as 64 columns.
- Can include columns defined with a UDT data type.
- Cannot contain columns defined with XML, BLOB, CLOB, BLOB-based UDT, CLOB-based UDT, XML-based UDT, Period, or JSON data types.
- Cannot be defined on global temporary trace tables.
- Can be created or dropped dynamically as data usage changes or if they are found not to be useful for optimizing data retrieval performance.
- Require additional disk space to store subtables.
- Require additional I/Os on INSERTs, DELETEs, and possibly on UPDATEs and MERGEs.
- Should not be defined on columns whose values change frequently.
- Should not include columns that do not enhance selectivity.
- Should not use composite secondary indexes when multiple single-column indexes and bit mapping might be used instead.
- Composite secondary index is useful if it reduces the number of rows that must be accessed.
- Most efficient for selecting a small number of rows.
- NUSIs can be hash-ordered or value-ordered.
- Ordering for NUSIs defined with an ORDER BY clause is restricted to a single numeric or DATE column of 4 of fewer bytes.
- If they cover, or partially cover, a query, then they further improve their usefulness.
- Can be used to enforce row uniqueness for multiset NUPI and NoPI tables.
- Guarantee that each complete index value is unique.
- Any access is, at most, a two-AMP operation.
- Useful for locating rows having a specific value in the index.
- Can be hash-ordered or value-ordered.
Value-ordered NUSIs are particularly useful for enhancing the performance of range queries.
- Any access is an all-AMPs operation with the exception of the case where a NUSI is defined on the same column set as the primary index for the table.
- If an index is defined with an ORDER BY clause, it counts as 2 consecutive indexes against the table limit of 32 secondary, hash, and join indexes (see Importance of Consecutive Indexes for Value-Ordered NUSIs).
For more information about secondary indexes, see the information about CREATE INDEX and CREATE TABLE in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144.