15.00 - Workload Characteristics and Joins - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Workload Characteristics and Joins

Consider the following points when you write join queries against a partitioned table:

  • Specify equijoins on the primary index and partitioning column sets, if possible, in order to prejudice the Optimizer to use efficient RowKey‑based joins (see SQL Request and Transaction Processing).
  • Consider including the partitioning column in the nonpartitioned primary index table so you can join on the partition column. This means that, depending on the situation, you might want to consider denormalizing the physical schema to enhance the performance of partitioned table‑to‑nonpartitioned table joins.

  • If you specify an equijoin on the primary index column set, but not on the partitioning column set, the fewer combined partitions that exist after any row partition elimination, the better.
  • Otherwise, the table might need to be spooled and sorted.

    The Optimizer can specify sliding‑window joins when there are a small number of participating combined row partitions (see SQL Request and Transaction Processing).

  • Use RANGE_N to define fewer partitions and specify conditions on the row partitioning columns to reduce the number of combined row partitions involved in the join by evoking partition elimination.
  • To ensure that the Optimizer creates good query plans for your partitioned tables, you should always collect PARTITION statistics and keep them current.

    If you have not collected PARTITION statistics, the Optimizer does not know whether a combined row partition is empty or not, so it has to assume all defined combined row partitions might have rows with respect to the plan it generates; however, it might choose among several such plans based on the estimated number of populated combined row partitions.

  • Dynamic row partition elimination for a product join improves performance when a partitioned table and another table are equijoined on the row partitioning column of the partitioned table (see SQL Request and Transaction Processing).
  • Remember to collect statistics (see “COLLECT STATISTICS (Optimizer Form)” in SQL Data Definition Language) on all of the following.

  • The primary indexes of both tables.
  • The partitioning columns of the partitioned table.
  • The column in the nonpartitioned table that is equated to the partitioning column of the partitioned table.
  • The system-derived PARTITION column of all partitioned tables (see “PARTITION Columns” on page 801).
  • The recommended practice for recollecting statistics is to set appropriate thresholds for recollection using the THRESHOLD options of the COLLECT STATISTICS statement. See “COLLECT STATISTICS in SQL Data Definition Language for details on how to do this.