16.10 - Incremental Planning and Execution - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
16.10
created_date
June 2017
category
Programming Reference
User Guide
featnum
B035-1142-161K

Incremental Planning and Execution (IPE) is an adaptive query optimization framework in which the query optimizer can partially plan and execute a query incrementally. Teradata Database dynamically collects the intermediate results or statistics on the intermediate results from each partial execution. The optimizer uses results feedback to rewrite the remaining portion of the query; this enables advanced optimizations such as partition elimination, join elimination, join index rewrite, and predicate simplification. The optimizer uses statistics feedback to produce a more accurate cost estimate, which is used to choose an optimal execution plan.

There are eight types of dynamic feedback in the two categories.

Results feedback:
  • Single-row relations: base relations with equality predicates on UPI or USI.
  • Noncorrelated scalar subqueries.
  • Noncorrelated EXISTS or NOT EXISTS subqueries in a WHERE clause.
  • Noncorrelated single-column IN, NOT IN, ANY, or ALL subquery in a WHERE clause.
  • Single-row spools: nonfolded derived tables or views for which the result is determined to be one row.
Statistics feedback:
  • Nonfolded derived tables or views.
  • Remote tables, table operators, and table functions.
  • Intermediate spools input to aggregations or analytic functions.
Based on execution, the kind of feedback is dynamically switched between results and statistics. AMPs choose the kind of feedback based on the actual execution and result size, regardless of the feedback kind requested by the optimizer:
  • For the original results feedback, if the size of the target spool is too big, switching from results feedback to statistics feedback reduces processing loads on the optimizer to minimize parsing time.
  • For the original statistics feedback, if the target spool is empty or contains only one row, switching from statistics feedback to results feedback enables complex queries to benefit from results feedback.

Query Rewrites with Results Feedback

The result of the following types of execution can be fed back to the optimizer as results feedback:

  • Base relation with equality predicates on UPI or USI specifications (called a single row relation)
  • Nonfolded derived table or a view which is guaranteed to return one row (called a single row spool)
  • Noncorrelated EXISTS or a NOT EXISTS subqueries in a WHERE clause
  • Noncorrelated single-column IN, NOT IN, ANY, or ALL subquery in a WHERE clause

Example: IPE results feedback for a noncorrelated scalar subquery

First the optimizer detects the noncorrelated scalar subquery (bolded in the example).

     SELECT *
     FROM t1, (SELECT t2.a2
               FROM t2, t3
               WHERE t2.a2 = t3.a3
               GROUP BY t2.a2) AS dt (a2)
     WHERE t1.a1 < (SELECT MAX(t4.a4)
                    FROM t4)
     AND   t1.a1 = dt.a2;

Then the optimizer forms a fragment to execute the scalar subquery. The result of the scalar subquery is fed back to the optimizer and the result is substituted for the scalar subquery.

Assume that the result is 20. The substitution enables the following query rewrites:
  • Using transitive closure to derive predicates.
    SELECT *
    FROM t1, (SELECT t2.a2
              FROM t2, t3
              WHERE t2.a2 = t3.a3
              GROUP BY t2.a2) AS dt (a2)
    WHERE t1.a1 < 20 
    AND   t1.a1 = dt.a2
    AND   dt.a2 < 20;
  • Pushing the derived predicate into a derived table.
    SELECT *
    FROM t1, (SELECT t2.a2
              FROM t2, t3
              WHERE t2.a2 = t3.a3
              AND   t2.a2 < 20 
              GROUP BY t2.a2) AS dt (a2)
    WHERE t1.a1 <  20 
    AND   t1.a1 = dt.a2
    AND   dt.a2 < 20;
    
  • Using transitive closure to derive predicates inside the derived table.
         
    SELECT *
    FROM t1, (SELECT t2.a2
              FROM t2, t3
              WHERE t2.a2 = t3.a3
              AND   t2.a2 < 20 
              AND   t3.a3 < 20 
              GROUP BY t2.a2) AS dt (a2)
    WHERE t1.a1 <  20 
    AND   t1.a1 = dt.a2
    AND   dt.a2 < 20;

Further advanced optimizations as discussed above, such as partition elimination, can then occur.