Miscellaneous Considerations for Star Join Optimization - Advanced SQL Engine

Miscellaneous Considerations for Star Join Optimization - Advanced SQL Engine - Teradata Database

SQL Request and Transaction Processing

Product

Advanced SQL Engine

Teradata Database

Release Number

17.10

Published

July 2021

Language

English (United States)

Last Update

2021-07-28

dita:mapPath

uqf1592445067244.ditamap

dita:ditavalPath

uqf1592445067244.ditaval

dita:id

B035-1142

lifecycle

Product Category

Teradata Vantage™

Star Join Planning and Statistics

Join planning is based on the estimated cardinalities of the results tables. The cardinalities usually cannot be precisely estimated without accurate statistics.

The cardinality of a star join is estimated based on the cardinality of the small table join result and the selectivity of the collection of large table join columns. Therefore, to guarantee a good join plan for queries involving star joins, the following usage considerations apply:

Statistics must be collected for all the tables on their primary [AMP] indexes, as well as for each index used in the query.
If constraints are specified on unindexed columns, statistics must be collected on these columns as well.

Avoiding Hash Synonyms for Star Joins

Depending on the columns making up the primary index, hash synonyms might occur. Hash synonyms, which usually occur when the primary index is composed of only small integer columns, always degrade query performance.

Changing Data Types to Enhance Performance

If possible, design your tables and queries so that joined columns are from the same domain (of the same data type), and if numeric, of the same size. If the joined columns are of different data types (and different sizes, if numeric), changing the type definition of one of the tables should improve join performance.

If there is no join condition specified on any index, neither table need be changed. In such cases, if the same data types are specified on the joined columns, the primary index might be used for an intermediate join result, thus eliminating the need for rehashing.

If, however, an index can be used in the join, and if some columns of the index are of smaller size, then one of the tables might have to be changed. To improve performance, it is frequently better to change the smaller table to define its join columns using the data type of the larger table.

For example, consider the following join condition, assuming that table_1.NUPI is typed SMALLINT and table_2.NUPI is typed INTEGER.

    table_1.NUPI = table_2.NUPI

If table_1 is the larger table, you should consider changing table_2.NUPI to type SMALLINT. However, if table_2 is the larger table, consider changing table_1.NUPI to type INTEGER.

Changing Conditional Expressions to Use One Index Operand

If one side of a join condition combines expressions and indexing, performance is generally not as good as if just the index is an operand. Consider modifying the equality to isolate the index on one side, exclusive of any expressions.

For example, consider the following conditional expressions.

The condition is stated in the following example using the primary index of table_2, which is table_2.NUPI, in an expression:

   table_1.x = table_2.NUPI - 1

This is a suboptimal way to specify the condition.

The following specification of the condition is optimal because it separates the primary index of table_2 from the expression, moving it to the other side of the equality condition using simple algebraic addition of the same term (+1) to both sides of the equation.

   table_1.x + 1 = table_2.NUPI