Nonunique Secondary Indexes - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
kko1591750222108.ditamap
dita:ditavalPath
kko1591750222108.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

Nonunique secondary indexes are typically assigned to nonunique column sets that frequently appear in WHERE clause selection conditions, join conditions, ORDER BY and GROUP BY clauses, foreign keys, and miscellaneous other conditions such as UNION, DISTINCT, and any attribute that is frequently sorted.

Highly selective NUSIs are useful for reducing the cost of frequently made selections and joins on nonunique columns, and provide extremely fast access for equality conditions. This is particularly true for NoPI tables (see NoPI Tables, Column-Partitioned NoPI Tables, and Column-Partitioned NoPI Join Indexes), where the only other access method might be a full-table scan. Note that NUSIs with low selectivity can be less efficient than a full-table scan.

NUSIs are also useful for range access and in-list conditions and for geospatial indexes.

Also note the following about NUSIs:

  • NUSI access is always an all-AMPs operation unless the index is defined on the same columns as the primary index. This is allowed when the NUSI is value-ordered or when the table or join index is partitioned and not all the partitioning columns are included in the primary index.
  • The subtables must be scanned in order to locate the relevant pointers to base table rows. This is a fast lookup process when a NUSI is specified in an equality or range condition because the NUSI rows are either hash-ordered or value-ordered on each AMP.
  • NUSI subtables are not covered by the active read fallback feature (see Physical Database Integrity for details).

Relationship Between a NUSI Subtable Row and Base Table Rows

A particular NUSI subtable row points to one or many base table rows on that same AMP. The relationship between a NUSI value and any individual AMP in a configuration is either 0:1, 1:1, or 1:M.

This relationship … Reflects the fact that …
0:1 an AMP contains no NUSI subtable index rows for a particular NUSI value.
1:1 an AMPs contains 1 NUSI subtable index row for a particular NUSI value.
1:M an AMP contains more than 1 NUSI subtable index row for a particular NUSI value.

NUSI Access and Performance

NUSI requests are all-AMP requests unless the NUSI is defined on the same columns as a primary index or primary AMP index.

The usefulness of a NUSI is correlated with the number of rows per value: the higher number of rows per value, the less useful the index. If the number of rows for a NUSI value exceeds the number of data blocks in the table, the usefulness of the NUSI might be questionable. On the other hand, as NUSI values approach uniqueness (meaning that the number of rows per value is either close to 1 or is significantly less than the number of AMPs in the system), an all-AMPs table access is wasteful and you should consider defining a join index (see Join and Hash Indexes) to support DML requests against the table instead of a NUSI.

Because NUSI access is usually an all-AMPs operation, NUSIs may seem to have limited value. If you have to access all AMPs in the configuration to locate the requested rows, why bother with an index?

The answer to that question is provided by the following list of ways that NUSIs can improve the performance of your decision support queries:
  • NUSI access is often faster than a full-table scan, particularly for extremely large tables. A full-table scan is also an all-AMP operation.
  • A NUSI that covers (see NUSIs and Query Covering for a definition of covering) the columns requested by a query is often included in Optimizer access plans.
  • A NUSI that covers a LIKE expression or any selective inequality conditions is often included in Optimizer access plans.
  • A NUSI on the same columns as the primary index (this is only allowed when the primary index does not include all the columns of all the partitioning columns) may be more efficient than accessing using the primary index when there is no or limited partition elimination for a query.

While NUSI access is usually an all-AMPs operation, keep in mind that the AMPs work in parallel. If all the AMPs have qualified rows, then this is a very efficient operation. If some or many of the AMPs do not have qualified rows, then those AMPs are doing work just to determine that they have no qualified rows. Note that if there are more rows per NUSI value than AMPs, it is likely that every AMP will have one or more qualified rows.

Depending on demographics and environmental cost variables, the Optimizer will specify a full-table scan instead of a NUSI access when it determines that the scan would be a more efficient access method.

Selectivity Considerations

Selectivity refers to the percentage of rows in a table containing the same nonunique secondary index value. An index that has high selectivity retrieves few rows. A unique primary index retrieval, for example, is highly selective because it never returns more than one row. An index that has low selectivity retrieves many rows.

Low Selectivity Indexes

When an index is said to have low selectivity, that means that many rows have the same NUSI value and there are relatively few distinct values.

A column with those characteristics is usually a poor choice for a NUSI because the cost of using it may be as high or higher than a full-table scan.

For example, assume that employee table contains 10,000 rows of about 100 bytes each, and there are only 10 different departments. If an average employee table data block is 2,560 bytes and can store about 25 rows, then the entire table requires about 400 data blocks.

If dept_no is defined as a nonunique secondary index on the employee table, and the dept_no values are evenly distributed, then the following query accesses about 1,000 row selections.

     SELECT *
     FROM employee
     WHERE dept_no = 300;

Each AMP reads its own rows of the dept_no secondary index subtable. If any rows contain the index value 300, the AMP uses the associated rowIDs to select the data rows from its portion of the employee table.

Regardless of the number of AMPs involved, this retrieval requires 1,000 row selections from the employee table. To satisfy this number of select operations, it is likely that all 400 employee table data blocks would have to be read.

If that were the case, then the number of I/O operations undertaken by the retrieval could easily exceed the number required for a full-table scan. In such instances, a table scan would actually be a much more efficient solution than a NUSI-based retrieval.

High Selectivity Index

If deptno is a high selectivity index, where few employee rows share the same deptno value, then using deptno for retrieval provides better performance than a full-table scan. Because of these selective conditions, the Optimizer specifies NUSI access for request processing when it is less costly than a full-table scan.