15.00 - Indexes and Partitioning - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Indexes and Partitioning

Typically, retrieving or updating data involves a relatively small number of rows. Without indexes and partitions, Teradata Database would need to scan every row in the table to find the rows of interest. Indexes and partitions can speed data access by providing an alternative, more direct path to the data of interest. In some cases, a query can be fulfilled even more quickly by accessing the index alone, without requiring access to the indexed table at all.

Because CPUs are becoming faster at a far greater rate than disk storage access rates, appropriate indexing and partitioning can be an important factor in altering the CPU‑to‑I/O ratio. In fact, it is sometimes possible to render previously I/O‑bound systems CPU‑bound or nearly CPU‑bound through optimal indexing and partitioning of database tables.

The Teradata Database parallel architecture makes indexing and partitioning an aid to better performance, not a crutch necessary to ensure adequate performance. Full‑table scans are not something to be feared in the Teradata Database environment. This means that unplanned, ad hoc queries that characterize the data warehouse process, and that often are not supported by indexes, perform very effectively for Teradata Database using full‑table scans.

The classic index for a relational database is itself a file made up of rows having two parts.

  • A (possibly unique) column in the referenced table.
  • A pointer to the location of that row in the base table (if the index is unique) or pointers to all possible locations of the rows (if the index is nonunique).
  • Because Teradata Database is a massively parallel architecture, it requires an efficient method of distributing and retrieving its data. For all database objects except NoPI and column partitioned objects, that method is hashing. All Teradata Database indexes are based, at least partially, on the rowhash of column values rather than directly on the column values. Rows for a table with a primary index and unique secondary indexes are stored by an AMP in hash order. Nonunique secondary, join, and hash indexes can be stored by an AMP in hash order or in value order of a column. Value order may make an index more useful for satisfying range conditions.