15.00 - Collecting Statistics on Hash and Join Indexes - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

Teradata Database
Programming Reference

Collecting Statistics on Hash and Join Indexes

Single- and multitable join indexes are fundamentally different database objects in many ways, and they are used for different purposes.

Statistics for base tables and multitable join indexes are not interchangeable, though a base table and an underlying non‑sparse single‑table join index can inherit statistics from one another (search the term bidirectional inheritance in SQL Request and Transaction Processing for details). If a given column is defined for both a base table and a multitable join index, you must collect statistics for each object on which it is defined. The statistics for column values stored in a base table can be very different from those for the same column stored in a derived join result in a multitable join index.

If you do not explicitly collect statistics for a multitable join index, the Optimizer does not attempt to use the statistics collected for its underlying base tables in an effort to derive statistics for the join index.

It is often useful to think of a multitable join index table as a special base table that represents a derived or pre-join. For example, execution plans that involve a join index must access it using a full table scan or an indexed scan. Statistics might need to be collected on search condition keys to assist the Optimizer to evaluate these alternative access paths.

Because execution plans might involve joining a join index with yet another table that is not part of the join index, it is often useful to collect statistics on these join columns to assist the Optimizer in estimating cardinality.

At minimum, you should collect statistics on the primary index of the join index. This provides the Optimizer with baseline statistics including the cardinality of the join index.

You should also collect statistics on secondary indexes defined on any join indexes. This helps the Optimizer to evaluate alternate access paths when scanning a join index.

You might also want to consider collecting statistics on any additional join index columns that frequently appear in WHERE clause search conditions, especially if the column is the sort key for a value-ordered join index.

There is little value in collecting statistics for joining columns that are specified in the join index definition itself. Instead, you should collect statistics on their underlying base tables.

This action also improves the performance of creating and maintaining join indexes, particularly during updates. Guidelines for doing this are identical to those for collecting statistics on the tables accessed for any regular join query.