15.00 - Maintenance Costs of Join Indexes - Teradata Database

Teradata Database Design

Product: Teradata Database. Release: 15.00. Category: User Guide. Document number: B035-1094-015K.

Maintenance Costs of Join Indexes

Join indexes can be expensive to maintain. For many customers, the most important factor in the decision to use a join index is likely to be how much it costs to maintain.

Each time a join-indexed base table column is updated, the corresponding join index table column must also be updated. Each time a new row is added to or an existing row is deleted from a join-indexed base table, the corresponding join index table rows must also be inserted or deleted.
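The paired maintenance described above can be pictured with a small toy model. This is only a sketch in Python, not how Teradata Database implements join index maintenance: the class `JoinIndexedTable`, its write counters, and the column names (`emp_no`, `dept_no`, `phone`) are all hypothetical, and the model assumes a single-table join index that simply projects a subset of the base table columns.

```python
from dataclasses import dataclass, field


@dataclass
class JoinIndexedTable:
    """Toy model (hypothetical): a base table plus one join index
    that projects a subset of the base table columns."""
    indexed_columns: tuple
    base_rows: dict = field(default_factory=dict)   # key -> full row
    index_rows: dict = field(default_factory=dict)  # key -> projected row
    base_writes: int = 0
    index_writes: int = 0

    def _project(self, row):
        # Keep only the columns that the join index carries.
        return {c: row[c] for c in self.indexed_columns}

    def insert(self, key, row):
        # A new base table row always needs a matching join index row.
        self.base_rows[key] = dict(row)
        self.base_writes += 1
        self.index_rows[key] = self._project(row)
        self.index_writes += 1

    def delete(self, key):
        # Deleting a base table row also deletes its join index row.
        del self.base_rows[key]
        self.base_writes += 1
        del self.index_rows[key]
        self.index_writes += 1

    def update(self, key, column, value):
        # Only updates to join-indexed columns touch the join index.
        self.base_rows[key][column] = value
        self.base_writes += 1
        if column in self.indexed_columns:
            self.index_rows[key][column] = value
            self.index_writes += 1


table = JoinIndexedTable(indexed_columns=("emp_no", "dept_no"))
table.insert(1, {"emp_no": 1, "dept_no": 10, "phone": "555-0100"})
table.update(1, "dept_no", 20)        # indexed column: index is updated too
table.update(1, "phone", "555-0199")  # not indexed: base table only
table.delete(1)
print(table.base_writes, table.index_writes)  # prints "4 3"
```

In this model, three of the four base table writes trigger an additional join index write, which is the overhead the cost evaluation has to account for.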

Because of this maintenance overhead, always weigh the benefit a join index brings to your standard queries against the cost of creating and maintaining it (see “Cost/Benefit Analysis for Join Indexes” on page 534).

Maintenance Cost Optimizations Based on Foreign Key‑Primary Key Joins

Consider the added cost of join index maintenance carefully when you design the indexes for your data warehouse, so that the Optimizer can cover the maximum number of queries with the minimum number of join indexes. Designing with foreign key‑primary key joins (see “Restriction on Coverage by Join Indexes When a Join Index Definition References More Tables Than a Query” on page 577) allows you to make these optimizations.

Whenever a base table column set that is shared with a join index is updated or deleted, or when a new row is inserted into the base table, the system generates extra steps to maintain the base table and join index concurrently. If the base table is specified as part of an outer join in the join index definition, the steps can be more complex because maintenance might be needed for both matched and unmatched row sets.

However, when the join columns have a foreign key‑primary key relationship, the system treats inner and outer joins alike (see “Restriction on Coverage by Join Indexes When a Join Index Definition References More Tables Than a Query” on page 577).

Maintenance Cost Optimizations for DELETE ALL Operations

A fastpath optimization is one that can be performed faster if certain conditions are met. For example, in some circumstances DELETE and INSERT operations can be performed faster if they can avoid reading the data blocks and avoid transient journaling.

Teradata Database uses both fastpath and deferred fastpath row partition DELETE operations for the following cases:

  • Deferred row partition deletion on a row-partitioned base table when a join index defined on the base table is not row-partitioned
  • Deferred partition deletion on a row-partitioned join index that is defined on a table
  • Deferred partition deletion on both a row-partitioned join index and its row-partitioned base table. In this case, Teradata Database performs the fastpath row partition deletion operations on the join index and the base table independently.

Teradata Database can perform fastpath DELETE ALL operations, but not deferred row partition deletion operations, for the following cases:

  • If a conditional DELETE with predicates is made against the table and it covers the entire join index, the join index is eligible for a fastpath DELETE.
  • All single‑table join indexes.
  • A multitable join index when the join between the tables is either an inner join or the table being deleted is the outer table in the join.
  • An implicit transaction with a single‑statement DELETE ALL table_name when the table has a join index defined on it.
  • An implicit transaction with a multistatement request.
  • An ANSI/ISO session mode transaction with a multistatement request.
  • A Teradata session mode transaction with a multistatement request.

Types of Join Index Examined for This Analysis

Different types of join indexes incur different costs of maintenance. For the analyses provided here, several different types of join index were used. The types are far from exhaustive, but they provide a fairly representative range of data that you can use to extrapolate roughly how much benefit a join index that uses a particular type of join is likely to provide.

The following table provides a list of the various types of joins defined for the join indexes used in this study. When you think of types of joins, you probably think of join processing types like merge join, nested join, product join, and so on.

The types of joins defined here (in-place, foreign, and ad hoc) are orthogonal to those join processing types. The types of joins and the types of join indexes go hand in hand. In both cases, it is the number of tables that are being redistributed that determines the type of join or join index being described.

Each join type can be used with either a simple or an aggregate join index.

     

| Join Type | Definition |
|---|---|
| In-place | The joined tables have a common primary index and are joined on that column set. A common example is a logical entity-subentity relationship as seen in an Employee - Employee_Phone relationship, where both tables have Employee Number as a common primary index. As a result, rows to be joined are always on a common AMP and do not have to be redistributed. This is the least expensive join of the three types examined. |
| Foreign | The tables are joined on a primary key - foreign key column set relationship, where the primary key column set is also the primary index column set. The primary index of the other join table is typically a foreign key in the first table. A common example is a join between an Employee table and a Department table, where the join is on the common Employee Number column, which is the primary index for the Employee table, but is only a foreign key in the Department table. As a result, the rows to be joined must be redistributed from the AMP having the Department table rows to the AMP having the Employee table rows. This join is more expensive than the in-place join, but less expensive than the ad hoc join. |
| Ad hoc | The tables are joined on a column set that is not a primary index in either table. As a result, rows from both tables must be redistributed to make the join. This join is the most expensive of the three types examined. |
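Because the three join types differ only in how many tables must be redistributed, the classification can be sketched as a small function. This is an illustrative Python sketch, not part of Teradata Database: the function `classify_join` and the column names are hypothetical, and it assumes a two-table join in which both tables use the same names for the join columns. It looks only at the redistribution count and does not check whether a primary key - foreign key relationship actually exists.

```python
def classify_join(join_columns, primary_index_a, primary_index_b):
    """Return (join_type, tables_redistributed) for a two-table join.

    Hypothetical helper: a table is assumed to stay in place only when
    the join column set equals its primary index column set.
    """
    join_cols = frozenset(join_columns)
    redistributed = sum(
        1
        for primary_index in (primary_index_a, primary_index_b)
        if frozenset(primary_index) != join_cols
    )
    # 0 tables moved -> in-place, 1 -> foreign, 2 -> ad hoc
    join_type = ("in-place", "foreign", "ad hoc")[redistributed]
    return join_type, redistributed


# Employee and Employee_Phone share emp_no as their primary index.
print(classify_join(["emp_no"], ["emp_no"], ["emp_no"]))
# Employee is indexed on emp_no; Department is indexed on dept_no.
print(classify_join(["emp_no"], ["emp_no"], ["dept_no"]))
# Neither table has the join column as its primary index.
print(classify_join(["hire_date"], ["emp_no"], ["dept_no"]))
```

The three calls correspond to the in-place, foreign, and ad hoc rows of the table, in increasing order of redistribution cost.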