17.10 - Collecting Statistics on Hash and Join Indexes - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B035-1184-171K
Language
English (United States)

Single- and multitable join indexes are fundamentally different database objects in many ways, and they are used for different purposes.

Statistics for base tables and multitable join indexes are not interchangeable, though a base table and an underlying non-sparse single-table join index can inherit statistics from one another. See the description of bidirectional inheritance in Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142. If a column is defined for a base table and a multitable join index, you must collect statistics for each object on which it is defined. The statistics for column values stored in a base table can be very different from those for the same column stored in a derived join result in a multitable join index.

If you do not explicitly collect statistics for a multitable join index, the Optimizer does not attempt to use the statistics collected for its underlying base tables in an effort to derive statistics for the join index.

A multitable join index table can be thought of as a special base table that represents a derived or pre-join table. For example, execution plans that involve a join index must access it using a full table scan or an indexed scan. Statistics might need to be collected on search condition keys to assist the Optimizer in evaluating these alternative access paths.

Because execution plans might involve joining a join index with yet another table that is not part of the join index, it is often useful to collect statistics on these join columns to assist the Optimizer in estimating cardinality.

Minimally, you should collect statistics on the primary index or primary AMP index of the join index to provide the Optimizer with baseline statistics including the cardinality of the join index.

You should also collect statistics on secondary indexes defined on any join indexes. This helps the Optimizer to evaluate alternate access paths when scanning a join index.

You might also want to consider collecting statistics on any additional join index columns that frequently appear in WHERE clause search conditions, especially if the column is the sort key for a value-ordered join index.

There is little value in collecting statistics for joining columns that are specified in the join index definition itself. Instead, you should collect statistics on their underlying base tables.

This action also improves the performance of creating and maintaining join indexes, particularly during updates. Guidelines for doing this are identical to those for collecting statistics on the tables accessed for any regular join query.