Miscellaneous Considerations for Star Join Optimization - Analytics Database - Teradata Vantage

SQL Request and Transaction Processing

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
dita:mapPath
zfm1628111633230.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
evd1472255317510
lifecycle
latest
Product Category
Teradata Vantageā„¢

Star Join Planning and Statistics

Join planning is based on the estimated cardinalities of the results tables. The cardinalities typically cannot be precisely estimated without accurate statistics.

The cardinality of a star join is estimated based on the cardinality of the small table join result and the selectivity of the collection of large table join columns. Therefore, to guarantee a good join plan for queries involving star joins, the following usage considerations apply:
  • Statistics must be collected for all the tables on their primary [AMP] indexes, as well as for each index used in the query.
  • If constraints are specified on unindexed columns, statistics must be collected on these columns as well.

Avoiding Hash Synonyms for Star Joins

Depending on the columns making up the primary index, hash synonyms may occur. Hash synonyms, which typically occur when the primary index is composed of only small integer columns, always degrade query performance.

Changing Data Types to Enhance Performance

If possible, design your tables and queries so that joined columns are from the same domain (of the same data type), and if numeric, of the same size. If the joined columns are of different data types (and different sizes, if numeric), changing the type definition of one of the tables typically improves join performance.

If there is no join condition specified on any index, neither table need be changed. If the same data types are specified on the joined columns, the primary index may be used for an intermediate join result, eliminating the need for rehashing.

However, if an index can be used in the join, and if columns of the index are smaller, one table may have to be changed. Changing the smaller table to define its join columns using the data type of the larger table typically improves performance.

For example, consider the following join condition, assuming that table_1.NUPI is typed SMALLINT and table_2.NUPI is typed INTEGER.

    table_1.NUPI = table_2.NUPI

If table_1 is the larger table, consider changing table_2.NUPI to type SMALLINT. However, if table_2 is the larger table, consider changing table_1.NUPI to type INTEGER.

Changing Conditional Expressions to Use One Index Operand

If one side of a join condition combines expressions and indexing, performance is generally not as good as if just the index is an operand. Consider modifying the equality to isolate the index on one side, exclusive of any expressions.

For example, consider the following conditional expressions.

The condition is stated in the following example using the primary index of table_2, which is table_2.NUPI, in an expression:

   table_1.x = table_2.NUPI - 1

This is a suboptimal way to specify the condition.

The following specification of the condition, which is optimal, moves the primary index of table_2 from the expression to the other side of the equality condition, using simple algebraic addition of the same term (+1) to both sides of the equation.

   table_1.x + 1 = table_2.NUPI