16.10 - Selecting Indexes for Star Joins - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
16.10
created_date
June 2017
category
Programming Reference
User Guide
featnum
B035-1142-161K

This topic provides guidelines for selecting indexes for use in star joins. These are rules of thumb only—a more complete performance model is required in order to select indexes that optimize the entire mix of join queries on the table.

Create Indexes on Join Columns for Each Star Join

The performance of a star join can be greatly improved if you create an index on the collection of some join columns of the large table so that redistributing and sorting of the large table are avoided. If a large table is involved in more than one star join, you should create an index on the collection of some join columns associated with each star join.

For example, if the large table Widgets is joined with the small tables color, shape, and size using the collection of columns (color, shape, size) in one star join, and with the small tables period, state, and country using the collection of columns (period, state, country) in another star join, then you can create the following indexes on Widgets to be used in those star joins:

  • Primary index on (color, shape, size).
  • Nonunique secondary index on (period, state, country).

Star Join Criteria for Selecting an Index Type

You must decide the type of index that is to be created on each collection of join columns of a large table. When making that decision, consider the following guidelines:

  • A primary index is the best index for star joins.
  • Each table can have only one primary index.
  • The Optimizer does not use USIs and NUSIs for star joins when the estimated number of rows to be retrieved is large.

    Applications of NUSIs and USIs for star joins are limited, so always verify that when an index is created on a large table, it will be used by the Optimizer.

    If a NUSI or USI is used, the rowIDs are retrieved via a nested join, after which a rowID join is used to retrieve the data rows. Note that rowID joins are sometimes very ineffective.

Star Join Criteria When Any Join Column Can Be the Primary Index

If any of the collections of join columns meets the criteria for a good candidate primary index (that is, has enough unique values to guarantee that the large table is distributed evenly across the AMPs), then you should consider the following guidelines:

IF LT/ST joins are … THEN the primary index should be created using the …
used with the same frequency star join that results in the most number of rows selected.

This leaves the star joins that select fewer rows for the NUSIs and USIs.

not used with the same frequency collection of join columns associated with the most often used star join.

For example, if the star join between Widgets and (period, state, country) is used more often than the star join between Widgets and (color, shape, size), the primary index should be created on (period, state, country).

However, if the former join selects far fewer number of rows than the latter join, it may be better to associate the primary index with the latter join (on columns color, shape, and size).

Performance Modeling: Optimizing All Join Queries

To optimize the entire mix of join queries, you should design a more complete performance model for your database.

For example, a user might have a relatively short star join query that is used more frequently than an extremely long query.

In such cases, it might be better to select the primary index favoring the long query, even though the guidelines indicate otherwise. This is because the benefit of the long query may be very great compared to the cost of the short query, and the combination of joins results in a net gain in performance.

Not all join columns of the small tables must join with the large table index in an LT/ST-J1a (Reasonable Indexed) star join.

Using a Common Set of Star Join Columns in the Primary Index

If more than one combination of the large table columns is used in different star joins, and if the combinations are overlapping, then the primary index should consist of the common set of these combinations (if the set is qualified for the primary index).

This has the following advantages:

  • Fewer indexes are required.
  • More than one star join can share the same primary index.

For example, assume that the following conditions are true:

  • The collection of columns (color, shape, size) of the large table widgets is joined with the small tables (color, shape, and size) of a star join
  • The collection of columns (shape, size, and period) is joined with the small tables (shape, size, and period) of another star join

In this case, the primary index of widgets should be defined on the columns (shape, size) if that column set is qualified to be the primary index for widgets.