Incremental Planning and Execution | Optimizer Process | VantageCloud Lake - Incremental Planning and Execution

Incremental Planning and Execution | Optimizer Process | VantageCloud Lake - Incremental Planning and Execution - Teradata Vantage

Teradata® VantageCloud Lake

Deployment

VantageCloud

Edition

Lake

Product

Teradata Vantage

Published

January 2023

Language

English (United States)

Last Update

2024-04-03

dita:mapPath

phg1621910019905.ditamap

dita:ditavalPath

pny1626732985837.ditaval

dita:id

phg1621910019905

Incremental Planning and Execution (IPE) is an adaptive query optimization framework in which the query optimizer can partially plan and run a request incrementally. Vantage dynamically collects the intermediate results or statistics from each partial execution. The optimizer uses results feedback to rewrite the remaining portion of the request; this enables advanced optimizations such as partition elimination, join elimination, join index rewrite, and predicate simplification. The optimizer uses statistics feedback to produce a more accurate cost estimate, which is used to choose an optimal execution plan.

The following are types of dynamic feedback:

Results feedback:

Single-row relations: base relations with equality predicates on UPI or USI.
Noncorrelated scalar subqueries.
Noncorrelated EXISTS or NOT EXISTS subqueries in a WHERE clause.
Noncorrelated single-column IN, NOT IN, ANY, or ALL subquery in a WHERE clause.
Single-row spools: nonfolded derived tables or views for which the result is determined to be one row.

Statistics feedback:

Nonfolded derived tables or views.
Remote tables, table operators, and table functions.
Intermediate spools input to aggregations or analytic functions.

Based on execution, feedback kind is dynamically switched between results and statistics. AMPs choose feedback kind based on execution and result size, regardless of the feedback kind requested by the optimizer:

For the original results feedback, if the size of the target spool is too big, switching from results feedback to statistics feedback reduces processing loads on the optimizer to minimize parsing time.
For the original statistics feedback, if the target spool is empty or contains only one row, switching from statistics feedback to results feedback enables complex queries to benefit from results feedback.

Dynamic Planning and Multiple-Statement Requests

Dynamic planning for IPE can also be performed for multiple-statement requests (MSRs). MSRs often include complex query statements which can benefit from dynamic planning.

MSRs in FastExport mode is not eligible for dynamic planning.

Dynamic Plan Caching

Dynamic plans for IPE can be cached if the plan includes only statistics feedback. This type of caching can provide significant performance improvements for complex queries that are submitted often. Each PE maintains a separate cache. You can use DBQL to determine if dynamic plans are being cached.

When a new request qualifies for IPE, the optimizer checks whether dynamic planning has only statistics feedback. The second time the same request is submitted, the dynamic plan is generated again and cached. The third time the request is submitted, the cached dynamic plan is used.

Dynamic plans cannot be cached if the dynamic plan:

Includes a remote table access.
Is generated for a parameterized request.

Results Feedback for Unique-Join Tables

A unique-join table (UJT) is a base table which is joined by its unique columns with a predicate on the base table. The UJT is rewritten as an IN-list predicate. The query rewrite triggered by results feedback for UJTs can provide a significant performance improvement when a UJT is empty and expensive join operations can be eliminated.

To qualify as a UJT, the base table cannot be a DBC table, and the non-join columns of the base table cannot be referenced in the SELECT clause.

Example: IPE results feedback for a noncorrelated scalar subquery

First the optimizer detects the noncorrelated scalar subquery (bolded in the example).

SELECT *
FROM t1, (SELECT t2.a2
          FROM t2, t3
          WHERE t2.a2 = t3.a3
          GROUP BY t2.a2) AS dt (a2)
WHERE t1.a1 < (SELECT MAX(t4.a4)
               FROM t4)
AND   t1.a1 = dt.a2;

Then the optimizer forms a fragment to run the scalar subquery. The result of the scalar subquery is fed back to the optimizer and the result is substituted for the scalar subquery.

Assume that the result is 20. The substitution enables the following query rewrites:

Using transitive closure to derive predicates.

SELECT *
FROM t1, (SELECT t2.a2
          FROM t2, t3
          WHERE t2.a2 = t3.a3
          GROUP BY t2.a2) AS dt (a2)
WHERE t1.a1 < 20 
AND   t1.a1 = dt.a2
AND   dt.a2 < 20;

Pushing the derived predicate into a derived table.

SELECT *
FROM t1, (SELECT t2.a2
          FROM t2, t3
          WHERE t2.a2 = t3.a3
          AND   t2.a2 < 20 
          GROUP BY t2.a2) AS dt (a2)
WHERE t1.a1 <  20 
AND   t1.a1 = dt.a2
AND   dt.a2 < 20;

Using transitive closure to derive predicates inside the derived table.

SELECT *
FROM t1, (SELECT t2.a2
          FROM t2, t3
          WHERE t2.a2 = t3.a3
          AND   t2.a2 < 20 
          AND   t3.a3 < 20 
          GROUP BY t2.a2) AS dt (a2)
WHERE t1.a1 <  20 
AND   t1.a1 = dt.a2
AND   dt.a2 < 20;

Further advanced optimizations such as partition elimination can then occur.