Costing Considerations for Join Indexes - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

Join indexes, like secondary indexes, incur both space and maintenance costs. For example, INSERT, UPDATE, and DELETE operations must be performed twice: once for the base table and once for the join index.

Space Costs for Join Indexes

The following formula estimates the space overhead required for a join index.

Join Index Size = U × (F + O + (R × A))

where:

Parameter Description
F Length of the fixed column join_index_column_1
R Length of a single repeating column join_index_column_2
A Average number of repeated fields for a given value in join_index_column_1
U Number of unique values in the specified join_index_column_1
O Row overhead (assume 14 bytes)

Updates to the base tables can cause a physical join index row to split into multiple rows. The newly formed rows each have the same fixed field value but contain a different list of repeated field values. This applies specifically when the compressed join index format is being used.

The system, however, does not automatically recombine logically related split rows. To re-compact such rows, you must drop and recreate the join index.

Maintenance Costs for Join Indexes

The use of a join index entails the following.
  • Initial time consumed to calculate and create the index
  • Whenever a value in a join index column of the base table is updated, the join index must also be updated, including any required aggregation or pre-join effort.

However, if join indexes are suited to your applications, the improvements in request performance can far outweigh the costs.

Join indexes are maintained by generating additional AMP steps in the base table update run plan. Those join indexes defined with outer joins typically require additional steps to maintain any unmatched rows.

Expect a single-table join index INSERT operation to have maintenance overhead similar to an insert operation with an equivalent NUSI. UPDATE or DELETE operations. However, the INSERT operation may incur greater overhead with a single-table join index unless a value for the primary index of the join index is available at the time of the update.

Overhead for an in-place aggregate join index can be 3 times more expensive than maintaining the same table without that index. For an aggregate join index that redistributes rows, the maintenance overhead can be even more expensive.

Maintenance overhead for multitable join indexes without aggregates depends on the prejoin effort involved in constructing or changing a join index row. This can be up to 20 times or more expensive than maintaining the table without the index. The overhead is greater at higher hits per block, where hits means the number of rows in a block are touched.

Because Vantage writes a block only once regardless of the number of rows modified, as the number of hits per block increases:
  • The CPU path per transaction decreases (faster for the case with no join index than for the case with a join index)
  • Maintenance overhead for aggregate join indexes decreases significantly

If a DELETE or UPDATE request specifies a search condition on the primary index or secondary index of a join index, the join index may be directly searched for the qualifying rows and modified accordingly.

This direct-update approach is used when the request adheres to these requirements:
  • A primary index or secondary index access path to the join index
  • If a join_index_column_2 is defined, little or no modification to the join_index_column_1 columns
  • No modifications to the join condition columns in the join index definition
  • No modifications to the primary index columns of the join index

You need not drop the join index before a backup. However, you must drop join indexes before the underlying tables and databases are restored, if a restore is ever required. Otherwise an error is reported and the restore is not done.