The justification for predicate pushdown is to move operations as close to the beginning of query processing as possible in order to eliminate as many rows and columns as possible to reduce the cost of the query to a minimal value. This manipulation is particularly important in the Teradata parallel environment because it reduces the number of rows that must be moved across the BYNET to other AMPs. Because query processing begins at the bottom of the parse tree, these predicates are said to be pushed down the tree (The concept of parse trees is described in Translation to Internal Representation, as is predicate pushdown).
Pushing predicates enhances performance by reducing the cardinality of spools for views and derived tables that cannot be folded.
Predicate pullup is used less frequently than pushdown in query rewrite and optimization. Its typical purpose is to move more expensive operations further back in the processing queue so they have fewer rows and columns on which to operate after other query rewrites have been done. Because query processing begins at the bottom of the parse tree, these predicates are said to be pulled up the tree.
This rewrite method pushes predicates from containing query blocks into views or derived tables whenever possible.
Examples of Predicate Pushdown and Pullup
Suppose you have the following request:
SELECT MAX(total) FROM (SELECT product_key, product_name, SUM(quantity*amount) AS total FROM Sales, Product WHERE sales_product_key=product_key GROUP BY product_key, product_name) AS v WHERE product_key IN (10, 20, 30);
Predicate pushdown can move the outer WHERE clause predicate for this request into the derived table, v, and then evaluate it as part of the WHERE clause for v.
Pushing predicates helps performance by reducing the cardinality of spools for views and derived tables that cannot be folded.
A significant component of predicate pushdown is devoted to pushing conditions into spooled views or spooled derived tables. This rewrite analyzes conditions that reference a view, maps those conditions to the base tables of the view definition, and then appends them to the view definition. Such a rewrite has the potential to improve the materialization of that view.
For example, consider the derived table dt(x,y,z) in the following query:
SELECT * FROM (SELECT * FROM t1) AS dt(x,y,z) WHERE x=1;
If the derived table in the query is spooled, then the condition a1=1 can be pushed into it. The rewrite produces the following query:
SELECT * FROM (SELECT * FROM t1 WHERE a1=1) AS dt(x,y,z) WHERE x=1;
Note that the original query would have required a full-table scan of t1, while the rewritten query requires only a single-AMP access of t1, assuming a1 is the primary index for t1.
Consider another view definition and a specific query against it:
CREATE VIEW v (a, b, c) AS SELECT a1, a2, SUM(a3) FROM t1, t2, t3 WHERE b1=b2 AND c2=c3 GROUP BY a1, a2; SELECT v.a, v.b FROM v, t4 WHERE v.a=a4 ;
Because view column v.c is not referenced by the containing query block, the SUM(a3) term can be removed from the rewritten select list of the view definition for this request. This action reduces the size the spool for the view (assuming the view is spooled) and eliminates the unnecessary computation of the aggregate term SUM(a3).
Pushing projections can enable other rewrites as well. For example, consider the following table definitions:
CREATE TABLE t1 ( a1 INTEGER NOT NULL, b1 INTEGER, PRIMARY KEY (a1) ); CREATE TABLE t2 ( a2 INTEGER, b2 INTEGER, FOREIGN KEY (a2) REFERENCES t1);
Join elimination can be applied to the following request if the references to t1 are removed from the select list of the derived table.
SELECT 1 FROM (SELECT * FROM t1,t2 WHERE a1=a2) AS dt;