Using the BETWEEN Clause to Reduce Row Redistribution - Teradata Database

SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Language
English (United States)
Last Update
2018-09-24
dita:id
B035-1145
lifecycle
previous
Product Category
Teradata® Database

Using the BETWEEN Clause to Reduce Row Redistribution

When retrieving data for a time interval, you can use MIN and MAX dates to define a BETWEEN clause, which results in fewer row redistributions than other methods.

Given a reference calendar that contains 730 rows with:

  • calendar_date
  • fiscal_week
  • fiscal_month
  • fiscal_quarter
  • If you want summary data from a History table for fiscal_quarter, the standard query would be:

      SELECT H.item_code, SUM(H.items_sold), SUM(H.sales_revenue)
      FROM History H , Calendar C
      WHERE C.fiscal_quarter = '3Q06'
      AND   C.calendar_date = H.sale_date
      GROUP BY H.item_code
      ORDER BY H.item_code;

    The Optimizer processes the query as follows:

    1 Build a spool table with dates from the reference calendar (90 days).

    2 Duplicate the calendar spool. Either:

  • Product join the calendar spool with the History table (90 compares/history table row).
  • Sort both tables to do merge join.
  • Alternatively, redistribute the entire History table. Product join the large table with the calendar spool (~1 row /AMP).

    Another approach is to denormalize the History table. Add fiscal_week, fiscal_month, fiscal_quarter to the History table. Qualify fiscal_month directly in the denormalized table. The penalties for using this approach include:

  • Denormalization maintenance costs are higher.
  • Extra bytes require more I/Os.