15.00 - Join Access: Column Selection Guidelines - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Join Access: Column Selection Guidelines

When designing for join access, you should first consider selecting common join columns for use as primary indexes. Joining tables on their primary indexes permits the Optimizer to specify hash or merge joins, two highly effective techniques, in the join plan to further optimize a query.

The effectiveness of a merge join relates directly to whether it is made on primary indexes or not. Possible scenarios, presented in order of optimum performance from best to worst, are as follows:

 

                               Join Predicate

              Redistribution Action Required

Tables joined on their primary indexes (and partitioning columns, if there are any).

None.

Tables joined on their primary indexes, but not on all partitioning columns.

Direct join in some cases, otherwise redistribute and sort the rows to be joined.

One join column is a primary index; the other is not.

Qualified rows from one table must be redistributed and sorted.

Neither join column is a primary index.

Qualified rows from both tables must be redistributed and sorted.

For multicolumn joins, you should consider using all, or at least a subset, of the join columns as the primary index.

Analogously to designing for row access, the primary guideline for selecting a primary index for join access is to choose a column or, less preferably, a set of columns, that is most frequently used to make the join. In other words, define the primary indexes for the tables to be joined on a column set that is most frequently equated to discrete values in WHERE or ON clause predicates in your application environment.

If you define a primary index on too many columns, then those queries that do not specify all of the columns in the index are not assigned a hash or merge join by the Optimizer.

If you define a partitioned primary index with too large a number of populated partitions, join access performance can be degraded.