Maintenance Costs of Join Indexes
Join indexes can be expensive to maintain. For many customers, the most important factor in the decision to use a join index is likely to be how much it costs to maintain.
Each time a join-indexed base table column is updated, the corresponding join index table column must also be updated. Each time a new row is added to or an existing row is deleted from a join-indexed base table, the corresponding join index table rows must also be inserted or deleted.
Because of this maintenance overhead, you should always carefully evaluate the benefit of a join index created to enhance your standard queries vis-a-vis its cost to create and maintain (see “Cost/Benefit Analysis for Join Indexes” on page 534).
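To make this overhead concrete, the following is a minimal Python sketch, not a description of how Teradata Database actually implements maintenance. It models a hypothetical join index over two toy tables and counts how many join index rows each base table change touches. The names ToyJoinIndex, employee, phone, and index_writes are illustrative assumptions.

```python
class ToyJoinIndex:
    """Toy model of a materialized inner join of employee and phone rows on emp_no."""

    def __init__(self):
        self.employee = {}      # base table 1: emp_no -> name
        self.phone = {}         # base table 2: emp_no -> set of phone numbers
        self.index = set()      # join index rows: (emp_no, name, number)
        self.index_writes = 0   # join index rows touched by maintenance

    def insert_employee(self, emp_no, name):
        self.employee[emp_no] = name
        # A new base row adds one join index row per matching phone row.
        for number in self.phone.get(emp_no, set()):
            self.index.add((emp_no, name, number))
            self.index_writes += 1

    def insert_phone(self, emp_no, number):
        self.phone.setdefault(emp_no, set()).add(number)
        if emp_no in self.employee:
            self.index.add((emp_no, self.employee[emp_no], number))
            self.index_writes += 1

    def update_employee_name(self, emp_no, new_name):
        old_name = self.employee[emp_no]
        self.employee[emp_no] = new_name
        # An updated base column must be rewritten in every matching index row.
        for number in self.phone.get(emp_no, set()):
            self.index.discard((emp_no, old_name, number))
            self.index.add((emp_no, new_name, number))
            self.index_writes += 1

    def delete_employee(self, emp_no):
        name = self.employee.pop(emp_no)
        # A deleted base row removes every matching index row.
        for number in self.phone.get(emp_no, set()):
            self.index.discard((emp_no, name, number))
            self.index_writes += 1
```

In this toy model, a single-row update to a base table touches one join index row per matching row in the other table, which is why write-heavy tables make a join index more costly to keep.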
Maintenance Cost Optimizations Based on Foreign Key‑Primary Key Joins
You should consider the added cost of join index maintenance carefully when you are designing the indexes for your data warehouse to ensure that the minimum number of join indexes can be called upon by the Optimizer to cover the maximum number of queries. Designing with foreign key‑primary key joins (see “Restriction on Coverage by Join Indexes When a Join Index Definition References More Tables Than a Query” on page 577) allows you to make these optimizations.
Whenever a base table column set that is shared with a join index is updated or deleted, or when a new row is inserted into the base table, the system generates extra steps to maintain the base table and join index concurrently. If the base table is specified as part of an outer join in the join index definition, the steps can be more complex because maintenance might be needed for both matched and unmatched row sets.
However, when the join columns have a foreign key‑primary key relationship, the system treats inner and outer joins alike (see “Restriction on Coverage by Join Indexes When a Join Index Definition References More Tables Than a Query” on page 577).
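One way to see why inner and outer joins can be treated alike in this case is the following Python sketch. It is an illustration under stated assumptions, not Teradata's internal logic: the functions inner_join and left_outer_join and the sample rows are hypothetical, and the foreign key is assumed to be non-null and always matched by a primary key value.

```python
def inner_join(child_rows, parent_rows):
    """Join (fk, payload) child rows to a {pk: payload} parent; drop unmatched rows."""
    return [(fk, c, parent_rows[fk]) for fk, c in child_rows if fk in parent_rows]


def left_outer_join(child_rows, parent_rows):
    """Same join, but keep unmatched child rows with None on the parent side."""
    return [(fk, c, parent_rows.get(fk)) for fk, c in child_rows]


departments = {10: "Sales", 20: "Support"}          # primary key -> name

# Every foreign key value has a matching primary key value.
employees_fk_ok = [(10, "Ann"), (20, "Bob"), (10, "Cy")]

# One row violates the relationship: department 30 does not exist.
employees_orphan = employees_fk_ok + [(30, "Dee")]

same_when_fk_holds = (
    inner_join(employees_fk_ok, departments)
    == left_outer_join(employees_fk_ok, departments)
)
same_with_orphan = (
    inner_join(employees_orphan, departments)
    == left_outer_join(employees_orphan, departments)
)
```

When the relationship holds, the outer join has no unmatched row set, so same_when_fk_holds is true and there is nothing extra to maintain. With the orphan row present, the two results differ and the unmatched row set would need its own maintenance.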
Maintenance Cost Optimizations for DELETE ALL Operations
A fastpath optimization lets an operation run faster when certain conditions are met. For example, in some circumstances DELETE and INSERT operations run faster because they can avoid reading the data blocks and avoid transient journaling.
Teradata Database uses both fastpath and deferred fastpath row partition DELETE operations for the following cases:
In this case, Teradata Database performs the fastpath row partition deletion operations on the join index and the base table independently.
Teradata Database can perform fastpath DELETE ALL operations, but not deferred row partition deletion operations, for the following cases:
Types of Join Index Examined for This Analysis
Different types of join indexes incur different costs of maintenance. For the analyses provided here, several different types of join index were used. The types are far from exhaustive, but they provide a fairly representative range of data that you can use to extrapolate roughly how much benefit a join index that uses a particular type of join is likely to provide.
The following table provides a list of the various types of joins defined for the join indexes used in this study. When you think of types of joins, you probably think of join processing types like merge join, nested join, product join, and so on.
The types of joins defined here (in-place, foreign, and ad hoc) are orthogonal to those join processing types. The types of joins and the types of join indexes go hand in hand. In both cases, it is the number of tables that are being redistributed that determines the type of join or join index being described.
Each join type can be used with either a simple or an aggregate join index.
In-place
The joined tables have a common primary index and are joined on that column set.
A common example is a logical entity-subentity relationship as seen in an Employee - Employee_Phone relationship, where both tables have Employee Number as a common primary index.
As a result, rows to be joined are always on a common AMP and do not have to be redistributed.
This is the least expensive join of the three types examined.
Foreign
The tables are joined on a primary key - foreign key column set relationship, where the primary key column set is also the primary index column set. The primary index of the other join table is typically a foreign key in the first table.
A common example is a join between an Employee table and a Department table, where the join is on the common Employee Number column, which is the primary index for the Employee table, but is only a foreign key in the Department table.
As a result, the rows to be joined must be redistributed from the AMP having the Department table rows to the AMP having the Employee table rows.
This join is more expensive than the in-place (homogeneous primary index) join, but less expensive than the ad hoc join.
Ad hoc
The tables are joined on a column set that is not a primary index in either table.
As a result, rows from both tables must be redistributed to make the join.
This join is the most expensive of the three types examined.
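Because the three join types (in-place, foreign, and ad hoc) differ only in how many tables must be redistributed, the classification can be sketched in a few lines of Python. This is a simplified illustration: classify_join is a hypothetical helper, and it assumes the join columns carry the same names in both tables.

```python
def classify_join(join_cols, left_primary_index, right_primary_index):
    """Classify a two-table join by how many tables must be redistributed.

    A table stays in place only when the join column set is exactly its
    primary index column set (simplifying assumption for this sketch).
    """
    join = frozenset(join_cols)
    redistributed = sum(
        join != frozenset(primary_index)
        for primary_index in (left_primary_index, right_primary_index)
    )
    return {0: "in-place", 1: "foreign", 2: "ad hoc"}[redistributed]


# Employee and Employee_Phone share emp_no as their primary index.
in_place = classify_join({"emp_no"}, {"emp_no"}, {"emp_no"})

# The join column is the primary index of only one of the two tables.
foreign = classify_join({"emp_no"}, {"emp_no"}, {"dept_no"})

# The join column is not the primary index of either table.
ad_hoc = classify_join({"hire_date"}, {"emp_no"}, {"dept_no"})
```

The redistribution count (0, 1, or 2) follows the same order as the relative maintenance cost of the three types.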