15.00 - When You Should Consider Defining a Join Index - Teradata Database

Teradata Database Design

When You Should Consider Defining a Join Index

Join indexes are not suited for all applications and situations. The usefulness of a join index, like that of any other index, depends on the type of work it is designed to perform. Always prototype any join index and evaluate its usefulness to the applications it is designed to support before adding it to your production environment. The overhead of updating join index tables can outweigh their benefit in some situations.

The following situations all make a join index a likely performance enhancer:

  • Frequent joins of large tables with other large or moderately-sized tables that result in a significant number of the rows from both tables being joined.
  • Frequent joins of tables of high degree (having many columns) for which the same relatively small set of columns is repeatedly requested.
  • An alternate partitioning sequence for a vertical subset of data in one of the base tables (a so-called single-table join index) would remove the necessity of redistributing rows for a frequently made join.
  • The overhead in time and storage capacity for the creation and maintenance of a join index does not outweigh its retrieval benefits.
  • Frequent range queries that require joins of large tables with other large or moderately-sized tables and that result in a significant number of the rows from both tables being joined.
  • A row-partitioned join index can enhance query performance when the query specifies an equality or range constraint on the partitioning column set. For example, a single-table row-partitioned join index can take advantage of row partition elimination to improve the performance of a query that retrieves rows from it.

    Be aware that you cannot define row partitioning for a row-compressed join index.

  • If a frequently run query specifies a complex expression in its predicate, consider creating a single-table join index or a hash index on the table that includes that expression in the select list or column list, respectively, of its definition. Although you cannot collect statistics on complex base table expressions, creating a single-table join index that uses the expression transforms it into a simple column, and you can then collect statistics on that column. The Optimizer can use those statistics to estimate the single-table cardinality of the expression when a query predicate specifies it using a base table column. See SQL Request and Transaction Processing for more information.
  • Most queries against a column-partitioned table or join index are expected to be very selective on a variable subset of columns and to project a variable subset of the columns, where the accessed columns make up less than 10% of the column partitions for any particular query.
  • Sometimes you just need to experiment.

    For example, one application of a row-partitioned join index is for queries that involve row-partitioned base tables. However, if the base table is not row-partitioned but is designed to handle efficient joins on its primary index, you might still define a row-partitioned join index to provide an alternative organization of the data for optimal access based on row partitions. This is valid only if the join index is not row-compressed, because partitioning is not valid for row-compressed join indexes.

    See SQL Data Definition Language for further information about row-partitioned (PPI) join indexes.
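
The join-oriented situations in the list above can be sketched as follows. This is only an illustration: the orders and customer tables, their columns, and the index names are hypothetical and are not taken from this manual. The first definition prejoins two tables and keeps a small set of frequently requested columns. The second is a single-table join index that stores a vertical subset of orders with the join column as its primary index, so that rows need not be redistributed for a join on that column.

```sql
-- Hypothetical schema (assumed for illustration only):
--   customer(c_custkey, c_name, ...)
--   orders(o_orderkey, o_custkey, o_orderdate, o_totalprice, ...)

-- Multitable join index: prejoins customer and orders and keeps
-- only the narrow set of columns that queries request repeatedly.
CREATE JOIN INDEX cust_ord_ji AS
SELECT c_custkey, c_name, o_orderkey, o_orderdate, o_totalprice
FROM customer, orders
WHERE c_custkey = o_custkey
PRIMARY INDEX (c_custkey);

-- Single-table join index: a vertical subset of orders whose
-- primary index is the join column rather than the order key.
CREATE JOIN INDEX ord_by_cust_ji AS
SELECT o_custkey, o_orderkey, o_totalprice
FROM orders
PRIMARY INDEX (o_custkey);
```

When prototyping, one reasonable check is to run the target queries with the EXPLAIN request modifier before and after creating each index and to confirm that the Optimizer actually chooses the join index.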
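
A row-partitioned single-table join index might look like the following sketch. The orders table, its columns, the index name, and the date range are assumptions made for illustration. The query that follows places a range constraint on the partitioning column, which is the kind of predicate that makes row partition elimination possible if the Optimizer decides to use the join index.

```sql
-- Hypothetical table: orders(o_orderkey, o_custkey, o_orderdate, o_totalprice, ...)

-- Row-partitioned join index, one row partition per month of 2014.
-- It is not row-compressed, which row partitioning requires.
CREATE JOIN INDEX ord_month_ji AS
SELECT o_orderkey, o_custkey, o_orderdate, o_totalprice
FROM orders
PRIMARY INDEX (o_orderkey)
PARTITION BY RANGE_N(o_orderdate BETWEEN DATE '2014-01-01'
                                 AND     DATE '2014-12-31'
                                 EACH INTERVAL '1' MONTH);

-- Range constraint on the partitioning column: only the March
-- partition needs to be read if the join index covers the query.
SELECT o_orderkey, o_totalprice
FROM orders
WHERE o_orderdate BETWEEN DATE '2014-03-01' AND DATE '2014-03-31';
```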
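
The complex-expression case described in the list above can be sketched in the same way. Here the expression, the orders table, and all object names are hypothetical. The join index turns the expression into a simple named column, and statistics are then collected on that column.

```sql
-- Hypothetical table: orders(o_orderkey, o_orderdate, ...)

-- Single-table join index that materializes the expression
-- as a simple column named o_ordermonth.
CREATE JOIN INDEX ord_expr_ji AS
SELECT o_orderkey, EXTRACT(MONTH FROM o_orderdate) AS o_ordermonth
FROM orders
PRIMARY INDEX (o_orderkey);

-- Collect statistics on the derived column of the join index.
COLLECT STATISTICS COLUMN (o_ordermonth) ON ord_expr_ji;

-- A frequently run query whose predicate uses the same expression
-- on the base table column; the statistics above are intended to
-- help estimate how many rows this predicate selects.
SELECT o_orderkey
FROM orders
WHERE EXTRACT(MONTH FROM o_orderdate) = 12;
```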