15.00 - Low Selectivity Indexes - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Low Selectivity Indexes

When an index is said to have low selectivity, that means that many rows have the same NUSI value and there are relatively few distinct values.

A column with those characteristics is usually a poor choice for a NUSI because the cost of using it may be as high or higher than a full‑table scan.

For example, assume that employee table contains 10,000 rows of about 100 bytes each, and there are only 10 different departments. If an average employee data block is 2,560 bytes and can store about 25 rows, then the entire table requires about 400 data blocks.

If dept_no is defined as a non-unique secondary index on the employee table, and the dept_no values are evenly distributed, then the following query accesses about 1,000 row selections.

     SELECT * 
     FROM employee 
     WHERE dept_no = 300;

Each AMP reads its own rows of the dept_no secondary index subtable. If any rows contain the index value 300, the AMP uses the associated rowIDs to select the data rows from its portion of the employee table.

Regardless of the number of AMPs involved, this retrieval requires 1,000 row selections from the employee table. To satisfy this number of select operations, it is likely that all 400 employee data blocks would have to be read.

If that were the case, then the number of I/O operations undertaken by the retrieval could easily exceed the number required for a full‑table scan. In such instances, a table scan would actually be a much more efficient solution than a NUSI-based retrieval.