15.10 - Query Rewrite - Teradata Database

Teradata Database SQL Request and Transaction Processing

Teradata Database
Programming Reference
User Guide

Query Rewrite is the stage of query processing that occurs just prior to query optimization. Note that many query rewrites are done by the Optimizer, not by the Query Rewrite subsystem.

Query Rewrite involves applying the following processes to the parse tree wherever possible:

  • Substituting underlying base tables and covering indexes for views and derived tables (a method called view folding).
  • Substituting hash and join indexes for underlying base tables, views, or join operations.
  • Note that hash and single‑table join indexes are also used to estimate single‑table expression cardinalities (see “Using Join Index Statistics to Estimate Single‑Table Expression Cardinalities” on page 260).

  • Converting outer joins to inner joins.
  • Eliminating unnecessary joins altogether.
  • Using logical satisfiability and transitive closure either to eliminate terms or to add terms that facilitate further rewrites.
  • Simplifying predicates.
  • Pushing projections and predicate conditions into spooled views.
  • Eliminating set operation branches.
  • Pushing joins into UNION ALL views.
  • The Query Rewrite subsystem contains a Rewrite Driver (see “Query Rewrite Driver” on page 78), which understands and takes advantage of the interactions among various query rewrites to transform requests into better performing variants of themselves.

    The basic task of the Query Rewrite subsystem is to determine how query text can be rewritten to make it more efficient and more easily optimized without changing its semantics. In other words, Query Rewrite is the process of rewriting a query Q in a new form, query Q’, such that both of the following statements are true:

  • Both queries produce the identical result.
  • Q’ runs faster than Q.
  • Query Rewrite techniques can be rule‑based, such as predicate pushdown, or cost-based, such as rewriting a query to use a join index. Note that not all query rewrites are contained with the Query Rewrite subsystem. Some query rewrites, for example, can be applied in the query optimization phase of request parsing during join planning, or even after join planning is complete. For example, Partial GROUP BY and join order rewrites are done during join planning, and Common Spool Usage is a rewrite that is applied after join planning is done. This topic describes only those rewrites that are applied by the Query Rewrite subsystem.

    The following diagram shows the order in which the various Query Rewrite techniques are applied. Note that after the Pushing Joins Into UNION ALL Views stage, the Query Rewrite process loops back to the stage at which outer joins are converted to inner joins. This is because some of the later rewrites in the first round of the process often create new opportunities for additional rewrites that can be exploited to make Q’ run faster still.

    The Query Rewrite subsystem undertakes the following sequence of stages when it rewrites a request. Note that it does not apply all of the listed stages to all requests.

    The stages that might or might not occur are all preceded by the phrase “when possible.”

    1 Query Rewrite accepts the ResTree from the Resolver as input.

    The goal of Query Rewrite is to rewrite the SQL text in such a way that it produces a new ResTree that is semantically identical to the ResTree passed to it from the Resolver, but is revised in such a way that it will run faster than the original SQL text would have once it is optimized.

    2 When possible, Query Rewrite pushes projections into spooled views.

    This means that any columns projected by the original query that do not contribute to its final outcome are dropped from the ResTree.

    If it cannot push the columns projected by the query into spooled views, or after all possible column projections have been pushed into spooled views, Query Rewrite passes the query to the outer join‑to‑inner join conversion stage of the process.

    3 When possible, Query Rewrite rewrites outer joins as inner joins.

    Query Rewrite always converts outer joins to inner joins if the conversion does not produce spurious non‑matching rows.

    If it cannot convert outer joins to inner joins, or after all possible outer join‑to‑inner join rewrites have been done, Query Rewrite passes the query to the next stage of the process.

    4 When possible, Query Rewrite folds views and derived tables.

    Folding means that any views or derived tables specified in the query are rewritten using their underlying base tables and query‑covering indexes to make all their references both explicit and more optimal.

    There are two types of view folding:

  • Type 1
  • Type 1 view folding is defined by exclusion to mean all view folding that is not Type 2 view folding.

  • Type 2
  • Type 2 view folding is used to rewrite the following set of special expressions that are not handled by Type 1 view folding:

  • Views and derived tables that have constants or null manipulating CASE expressions in their definitions
  • and

  • Such views and derived tables are used as inner tables in OUTER JOIN operations
  • and

  • The constants or null‑manipulating CASE expression inside the view and derived table definitions are referenced in the outer query block.
  • All other view and derived table folding is handled by Type 1 folding.

    If there are no views or derived tables to be folded, or after all possible view and derived table folding has been done, Query Rewrite passes the query to the next stage of the process.

    5 When possible, Query Rewrite uses logical satisfiability and transitive closure (SAT‑TC) to eliminate terms from the query (see “Predicate Simplification” on page 94 for details about SAT‑TC).

    For example, a predicate of WHERE 1=0 cannot be satisfied, so it can be eliminated from the query.

    Similarly, if transitive closure can be employed to eliminate redundant steps toward reaching satisfiability, then those intermediate steps can be eliminated.

    Generally speaking, satisfiability is a by-product of transitive closure.

    If it cannot eliminate terms using satisfiability and transitive closure, Query Rewrite passes the query to the next stage of the process.

    Predicate Simplification is also a component of SAT‑TC (see “Predicate Simplification” on page 94 for details).

    The following techniques are the components of Predicate Simplification in Query Rewrite:

  • Constant predicate evaluation
  • Range‑based simplification
  • Constant movearound
  • Duplicate predicate removal
  • NULLEQ marking
  • Although Constant Predicate Evaluation and NULLEQ Marking are both components of Predicate Simplification conceptually, the system actually performs Constant Predicate Evaluation during the preprocessing phase of Query Rewrite, while it performs NULLEQ Marking during the postprocessing phase of Query Rewrite.

    6 When possible, Query Rewrite pushes predicate conditions into spooled views (see “Predicate Pushdown and Pullup” on page 104 for details about pushing predicate conditions into spooled views).

    What this means is that predicates can often be moved into another query block, making it possible to eliminate the query block from which they originated and consolidating the number of blocks that must be processed.

    Pushing predicate conditions into spooled views helps performance by reducing the size of spooled view files.

    If no conditions can be pushed into spooled views, or after all possible conditions have been pushed into spooled views, Query Rewrite passes the query to the next stage of the process.

    7 When possible, Query Rewrite eliminates join operations that do not contribute to the final result of the query (see “Eliminating Redundant Joins” on page 107 for details about this operation).

    The simplification made by eliminating joins within a view or the main query can provide more opportunities to fold views during the second iteration of the Query Rewrite process.

    8 Any of the rewrites done in stages 3 through 8 might make it possible to do additional rewrites that had not been possible during the first pass Query Rewrite made on the query; therefore, those stages are repeated a second time to capitalize on any such possibilities that were created by the first pass through the process.

    9 Query Rewrite passes ResTree´, which is semantically identical to the ResTree that entered the Query Rewrite subsystem from the Resolver, to the Optimizer for optimization of the access and join paths for the query.

    See “Query Rewrite” on page 72 for more details about the Query Rewrite process and subsystem.