15.10 - Partial GROUP BY Block Optimization - Teradata Database

Teradata Database SQL Request and Transaction Processing

Product: Teradata Database
Release: 15.10
Category: Programming Reference, User Guide
Document: B035-1142-151K

The Optimizer costs all alternative access and join plans. Where appropriate, aggregations are split, with part being done before the join and part being done after the join. This means less work for the join as well as less aggregation work after the join. The idea is to reduce the cardinalities as early as possible, which also helps reduce spool use.

The Optimizer applies a Partial GROUP BY optimization to a request if both of the following conditions are true.

  • Early application of aggregations is semantically correct.
  • The Optimizer estimates the Partial GROUP BY to be more cost effective than not applying it.
    In a join query plan for a query with a GROUP BY clause, Partial GROUP BY operations reduce the cardinalities of base tables, join indexes, and intermediate spools, thereby reducing the processing required to perform the join plan. To further enhance efficiency, certain of the partial grouping operations, including an intermediate Partial GROUP BY operation, a final GROUP BY operation, or both, can be eliminated when certain predefined conditions are met.

    The Partial GROUP BY optimization can be applied at almost any stage of the join plan. An early GROUP BY reduces the rows of the base relations or the intermediate join relations during the optimization process. However, GROUP BY and merge join operators tend to compete with one another for use in the lowest cost plan because both apply significant row reduction. Therefore, it is extremely important for the Optimizer to cost the alternatives.

    All evidence shows that the Partial GROUP BY optimization reduces the processing and I/O times for typical requests. The more duplicate grouping values a table has, the greater the reduction in I/O processing, which in some cases can be as much as a factor of 100 (see “Performance Analysis of the Partial GROUP BY Optimization” on page 390).

    The Optimizer has two additional paths to reduce the number of rows included in the join (see the sketch following this list):

  • Early GROUP BY: the Optimizer does all of the aggregation before the join.
  • Partial GROUP BY: the Optimizer does some of the aggregation before the join and the rest after the join. It also detects when the last aggregation can be skipped to improve performance.
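
    In hand-written SQL terms, early GROUP BY corresponds to fully aggregating a relation in a derived table before joining it, while partial GROUP BY leaves a final GROUP BY after the join. The following minimal sketch uses hypothetical tables sales(k, amt) and dim(k, grp) and assumes dim.k is unique:

         /* Early GROUP BY: all of the aggregation is done before the
            join. Equivalent to grouping the join result on s.k when
            dim.k is unique, because each sales group then matches
            exactly one dim row. */
         SELECT s.k, s.total
         FROM (SELECT k, SUM(amt) AS total
               FROM sales
               GROUP BY k) AS s, dim AS d
         WHERE d.k = s.k;

         /* Partial GROUP BY: pre-aggregate on the join key, then
            finish the grouping after the join. */
         SELECT d.grp, SUM(s.total)
         FROM (SELECT k, SUM(amt) AS total
               FROM sales
               GROUP BY k) AS s, dim AS d
         WHERE d.k = s.k
         GROUP BY d.grp;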

    This optimization was introduced by Chaudhuri and Shim (1994). Some of the Teradata applications of block optimization are described by Ghazal et al. (2003).

    Consider the following typical query:

         SELECT b1, c1, SUM(t1.float_1), SUM(t2.float_2), SUM(t3.float_3)
         FROM t1,t2,t3
         WHERE b1=b2 AND b2=b3
         GROUP BY b1,c1;

    Without Partial GROUP BY enabled, the Optimizer selects one of the following join plans:

  • [(t1 X t2) X t3 ]'
  • [(t2 X t3) X t1]'
  • [(t1 X t3) X t2]'
    where X represents the join operator and [ ]' denotes the GROUP BY operator applied to the relation.

    With Partial GROUP BY enabled, some additional join plans are considered, such as the following:

  • [([t2]' X t3) X t1]'
  • [(t1 X [t3]') X t2]'
  • [([t2]' X [t3]') X t1]'
  • [[t1 X t3]' X [t2]']'
    This provides the Optimizer with an opportunity to find a lower cost join plan among a larger set of possible join plans. Even though there are more join plans to consider, the additional search effort consumes negligible CPU time and can significantly reduce the execution time of the request. The sketch below illustrates how one of these plans maps to hand-written SQL.
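
    For instance, the plan [([t2]' X t3) X t1]' corresponds roughly to the following hand-written rewrite. This is a sketch only; the Optimizer performs the transformation internally. The COUNT(*) carried out of the pre-aggregation is the duplication factor needed to keep the sums over t1 and t3 correct after the join, and the sketch assumes b2 is the only t2 column referenced by the join:

         /* t2 is pre-aggregated on its join column b2 before the
            joins. Each (t1, t3) row pair now meets one t2a row
            instead of t2a.cnt rows of t2, so the sums over t1 and
            t3 are scaled by t2a.cnt. */
         SELECT t1.b1, t1.c1,
                SUM(t1.float_1 * t2a.cnt) AS sum_f1,
                SUM(t2a.s2)               AS sum_f2,
                SUM(t3.float_3 * t2a.cnt) AS sum_f3
         FROM t1,
              (SELECT b2, SUM(float_2) AS s2, COUNT(*) AS cnt
               FROM t2
               GROUP BY b2) AS t2a,
              t3
         WHERE t1.b1 = t2a.b2
         AND   t2a.b2 = t3.b3
         GROUP BY t1.b1, t1.c1;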

    The Partial GROUP BY optimization reduces cardinalities as early as possible in the optimization process.

    Consider the following example:

         SELECT ps_partkey, SUM(l_quantity), SUM (ps_supplycost)
         FROM partsupp, lineitem
         WHERE l_partkey = ps_partkey
         GROUP BY 1; 

    Without Partial GROUP BY enabled, the Optimizer redistributes lineitem on the join column l_partkey, performs the join with partsupp, and then aggregates the result of the join. With a one terabyte database, this redistributes 6 billion rows and joins them with partsupp to produce a 24 billion row result on which the aggregation is performed.

    With Partial GROUP BY enabled, the Optimizer first separately aggregates the relations partsupp and lineitem and then joins the results of those aggregations.
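
    A hand-written analogue of this plan is sketched below. The Optimizer performs the transformation internally; the COUNT(*) columns are the duplication factors needed to preserve the duplicate semantics of the join:

         SELECT pa.ps_partkey,
                /* each lineitem row matched pa.cnt partsupp rows */
                la.sum_qty  * pa.cnt AS sum_quantity,
                /* each partsupp row matched la.cnt lineitem rows */
                pa.sum_cost * la.cnt AS sum_supplycost
         FROM (SELECT ps_partkey, SUM(ps_supplycost) AS sum_cost,
                      COUNT(*) AS cnt
               FROM partsupp
               GROUP BY 1) AS pa,
              (SELECT l_partkey, SUM(l_quantity) AS sum_qty,
                      COUNT(*) AS cnt
               FROM lineitem
               GROUP BY 1) AS la
         WHERE la.l_partkey = pa.ps_partkey;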

    Without the transformation, approximately 85 billion rows must be read and written. This is largely because the 6 billion row lineitem relation is joined to the 800 million row partsupp relation to produce a 24 billion row join result spool for aggregation.

    With the transformation, the lineitem relation is aggregated as part of the sort and redistribution operation to produce a 200 million row spool. The partsupp relation is locally aggregated to produce another 200 million row spool. The two spools are then joined to produce a 200 million row result, and there is an overall reduction of about 3 times in the number of rows read and written.

    There are two ways to coalesce rows using Partial GROUP BY:

  • Partial SUM steps (see “Early GROUP BY With a Partial SUM Step” on page 386)
  • Sort/Group steps (see “SORT/GROUP Step and Partial GROUP BY Optimization” on page 387)
    This section demonstrates the EXPLAIN phrase partial SUM.

    Consider the following query:

         SELECT l_partkey,SUM(l_quantity),SUM(ps_supplycost)
         FROM partsupp, lineitem
         WHERE l_partkey = ps_partkey 
         GROUP BY 1; 

    Without Partial GROUP BY enabled, the join plan would be [partsupp X lineitem]'. With Partial GROUP BY enabled, the join plan is [partsupp]' X [lineitem]'.

    The following EXPLAIN text compares two explanations of this query: the first with Partial GROUP BY disabled and the second with Partial GROUP BY enabled.

    This is partial EXPLAIN text with Partial GROUP BY disabled.

    4)	We do an all-AMPs RETRIEVE step from TPCD50G.lineitem by way of an all-rows scan with no residual conditions into Spool 4 (all_amps), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 4 by row hash.  The result spool file will not be cached in memory.  The size of Spool 4 is estimated with high confidence to be 300,005,811 rows.  The estimated time for this step is 2 hours and 51 minutes.
    5)	We do an all-AMPs JOIN step from TPCD50G.partsupp by way of a RowHash match scan with no residual conditions, which is joined to Spool 4 (Last Use).  TPCD50G.partsupp and Spool 4 are joined using a merge join, with a join condition of ("L_PARTKEY = TPCD50G.partsupp.PS_PARTKEY").  The input table TPCD50G.partsupp will not be cached in memory, but it is eligible for synchronized scanning.  The result goes into Spool 3 (all_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory.  The size of Spool 3 is estimated with low confidence to be 1,200,023,244 rows.  The estimated time for this step is 2 hours and 46 minutes.
    6)	We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by way of an all-rows scan, and the grouping identifier in field 2.  Aggregate Intermediate Results are computed locally, then placed in Spool 5.  The aggregate spool file will not be cached in memory.  The size of Spool 5 is estimated with low confidence to be 1,200,023,244 rows.  The estimated time for this step is 10 hours and 6 minutes.  
    7)	We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory.  The size of Spool 1 is estimated with low confidence to be 1,200,023,244 rows.  The estimated time for this step is 2 hours and 59 minutes.
    8)	Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  The contents of Spool 1 are sent back to the user as the result on statement 1.

    This is partial EXPLAIN text with Partial GROUP BY enabled.

    4)	We do an all-AMPs partial SUM step to aggregate from TPCD50G.partsupp by way of an all-rows scan with no residual conditions, and the grouping identifier in field 1.  Aggregate Intermediate Results are computed locally, then placed in Spool 6.  The input table will not be cached in memory, but it is eligible for synchronized scanning.  The aggregate spool file will not be cached in memory.  The size of Spool 6 is estimated with low confidence to be 10,000,000 rows.  The estimated time for this step is 12 minutes and 36 seconds.
    5)	We execute the following steps in parallel.
         1) 	We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of an all-rows
            scan into Spool 5 (all_amps), which is built locally on the AMPs.  Then we do a
            SORT to order Spool 5 by row hash.  The result spool file will not be cached
            in memory.  The size of Spool 5 is estimated with no confidence to be
            10,000,000 rows.
         2) 	We do an all-AMPs partial SUM step to aggregate from TPCD50G.lineitem by way
            of an all-rows scan with no residual conditions, and the grouping identifier
            in field 1.  Aggregate Intermediate Results are computed globally, then placed
            in Spool 10.  The input table will not be cached in memory, but it is eligible
            for synchronized scanning.  The aggregate spool file will not be cached in
            memory.  The size of Spool 10 is estimated with low confidence to be
            10,000,000 rows.  The estimated time for this step is 3 hours and 39 minutes.
    6)	We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by way of an all-rows scan into Spool 9 (all_amps), which is redistributed by hash code to all AMPs.  Then we do a SORT to order Spool 9 by row hash.  The result spool file will not be cached in memory.  The size of Spool 9 is estimated with no confidence to be 10,000,000 rows.
    7)	We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a RowHash match scan, which is joined to Spool 9 (Last Use).  Spool 5 and Spool 9 are joined using a merge join, with a join condition of ("L_PARTKEY = PS_PARTKEY").  The result goes into Spool 3 (all_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory.  The size of Spool 3 is estimated with low confidence to be 993,739,248 rows.  The estimated time for this step is 2 hours and 16 minutes.
    8)	We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory. The size of Spool 1 is estimated with low confidence to be 993,739,248 rows.  The estimated time for this step is 2 hours and 28 minutes.
    9)	Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.  The contents of Spool 1 are sent back to the user as the result of statement 1.  

    With Partial GROUP BY disabled, the SUM step that reduces the cardinalities is not performed until step 6, after the join. In contrast, when Partial GROUP BY is enabled, the partial SUM steps are performed early, in steps 4 and 5.2.

    Because the Partial GROUP BY optimization is performed long before the join, the EXPLAIN text adds the phrase partial before the SUM step for easy identification of the optimization.

    Note that it is possible for the Optimizer to select a join plan that does not use this optimization, even though Partial GROUP BY is enabled. This might happen, for example, if the cost of aggregation makes it not worth doing because of the number of small groups.
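
    To check whether a given request receives the optimization, prefix it with the EXPLAIN request modifier and look for the phrase partial SUM (or SORT/GROUP) in the output. For example:

         EXPLAIN
         SELECT l_partkey, SUM(l_quantity), SUM(ps_supplycost)
         FROM partsupp, lineitem
         WHERE l_partkey = ps_partkey
         GROUP BY 1;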

    This section introduces the EXPLAIN phrase SORT/GROUP, which identifies a GROUP BY operation performed as part of a sort. The following request has a nested subquery, with the inner query specifying an OUTER JOIN. Both the inner and outer queries have GROUP BY clauses.

         SELECT c_count, COUNT(*) AS custdist
         FROM (SELECT c_custkey, COUNT(o_orderkey)
               FROM customer LEFT OUTER JOIN ordertbl 
                             ON  c_custkey = o_custkey 
                             AND o_comment NOT LIKE '%special%requests%'
               GROUP BY c_custkey) AS c_orders (c_custkey, c_count)
         GROUP BY c_count
         ORDER BY custdist desc, c_count DESC;

    This is the EXPLAIN text with Partial GROUP BY enabled.

    Explanation
    ---------------------------------------------------
    1)	First, we lock a distinct TPCD50G."pseudo table" for read on a RowHash to prevent global deadlock for TPCD50G.CUSTOMER.
    2)	Next, we lock a distinct TPCD50G."pseudo table" for read on a RowHash to prevent global deadlock for TPCD50G.ORDERTBL.
    3)	We lock TPCD50G.CUSTOMER for read, and we lock TPCD50G.ORDERTBL for read.
    4)	We do an all-AMPs RETRIEVE step from TPCD50G.ORDERTBL by way of an all-rows scan with a condition of ("NOT(TPCD50G.ORDERTBL.O_COMMENT LIKE '%special%requests%')") into Spool 5 (all_amps), which is redistributed by hash code to all AMPs.  Then we do a SORT/GROUP to order Spool 5 by row hash and non-aggregate fields grouping duplicate rows.  The result spool file will not be cached in memory.  The size of Spool 5 is estimated with no confidence to be 67,500,000 rows.
    5)	We do an all-AMPs JOIN step from TPCD50G.CUSTOMER by way of a RowHash match scan with no residual conditions, which is joined to Spool 5 (Last Use).  TPCD50G.CUSTOMER and Spool 5 are left outer joined using a merge join, with a join condition of ("TPCD50G.CUSTOMER.C_CUSTKEY = O_CUSTKEY").  The input table TPCD50G.CUSTOMER will not be cached in memory.  The result goes into Spool 3 (all_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory.  The size of Spool 3 is estimated with low confidence to be 74,954,952 rows.  The estimated time for this step is 15 minutes and 32 seconds.
    6)	We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (all_amps), which is built locally on the AMPs.  The result spool file will not be cached in memory. The size of Spool 1 is estimated with low confidence to be 74,954,952 rows.  The estimated time for this step is 11 minutes and 6 seconds.
    7)	We do a SUM step to aggregate from Spool 1 (Last Use) by way of an all-rows scan, and the grouping identifier in field 3.  Aggregate Intermediate Results are computed globally, then placed in Spool 10.  The size of Spool 10 is estimated with no confidence to be 8,658 rows.  The estimated time for this step is 1 minute and 26 seconds.
    8)	We do an all-AMPs RETRIEVE step from Spool 10 (Last Use) by way of an all-rows scan into Spool 8 (group_amps), which is built locally on the AMPs.  Then we do a SORT to order Spool 8 by the sort key in spool field1.  The size of Spool 8 is estimated with no confidence to be 8,658 rows.  The estimated time for this step is 0.25 seconds.  Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request. -> The contents of Spool 8 are sent back to the user as the result of statement 1.

    In this explanation, the phrase SORT/GROUP appears in Step 4. That means the Partial GROUP BY is used early to reduce cardinalities.
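
    A hand-written analogue of what the step 4 SORT/GROUP achieves is to pre-aggregate ordertbl on o_custkey before the outer join. This is a sketch only; the COALESCE restores the zero count for customers with no qualifying orders:

         SELECT c_count, COUNT(*) AS custdist
         FROM (SELECT c_custkey, COALESCE(o.order_count, 0)
               FROM customer
               LEFT OUTER JOIN
                    (SELECT o_custkey, COUNT(o_orderkey) AS order_count
                     FROM ordertbl
                     WHERE o_comment NOT LIKE '%special%requests%'
                     GROUP BY 1) AS o
                    ON c_custkey = o.o_custkey
              ) AS c_orders (c_custkey, c_count)
         GROUP BY c_count
         ORDER BY custdist DESC, c_count DESC;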

    When the system executes this query using the EXPLAINed plan but with Partial GROUP BY disabled, the elapsed time increases because the early row reduction performed by the SORT/GROUP in step 4 is lost.

    Partial GROUP BY does aggregations concurrently with joins, and GROUP BY operations might be performed several times. For several reasons, it is helpful for the Optimizer to know when these aggregations are terminated. The fundamental question is this: under what conditions can the last aggregation be eliminated?

    The following are examples of join plans that do not require a final GROUP BY operation.

    Assume you submit the following query for the TPC-H benchmark (see http://www.tpc.org/tpch/spec/tpch2.6.0.pdf for a complete definition of the relations used for this benchmark).

         SELECT l_suppkey,SUM(l_quantity),SUM(ps_availqty) 
         FROM lineitem,partsupp
         WHERE l_suppkey=ps_suppkey 
         GROUP BY 1;

    This query uses the join plan [lineitem]' X [partsupp]'.
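
    No final GROUP BY is needed because both sides are aggregated down to one row per suppkey, so the join on the grouping column is one-to-one and its result is already grouped. A hand-written sketch, with the usual COUNT(*) duplication factors:

         SELECT la.l_suppkey,
                la.sum_qty   * pa.cnt AS sum_quantity,
                pa.sum_avail * la.cnt AS sum_availqty
         FROM (SELECT l_suppkey, SUM(l_quantity) AS sum_qty,
                      COUNT(*) AS cnt
               FROM lineitem
               GROUP BY 1) AS la,
              (SELECT ps_suppkey, SUM(ps_availqty) AS sum_avail,
                      COUNT(*) AS cnt
               FROM partsupp
               GROUP BY 1) AS pa
         WHERE la.l_suppkey = pa.ps_suppkey;
         /* No outer GROUP BY: each derived table has one row per suppkey. */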

    Consider the following request:

         SELECT b1, c1, SUM(t1.float_1), SUM(t2.float_2), SUM(t3.float_3)
         FROM t1, t2, t3
         WHERE b1=b2 
         AND   b2=b3
         GROUP BY 1,2;

    With Partial GROUP BY, this query would normally use the join plan [[t2 X [t3]']' X [t1]']', with a final GROUP BY operation. Because the grouping columns cover the join columns, the last GROUP BY can be skipped, and the join plan can be optimized to [t2 X [t3]']' X [t1]'.

    The last GROUP BY operation need not be performed for the following request:

         SELECT c_custkey, c_name, 
                SUM(l_extendedprice*(1-l_discount)(FLOAT))(DECIMAL(18,2)) 
                  AS revenue, c_acctbal, n_name, c_address,c_phone,c_comment
         FROM customer, ordertbl, lineitem, nation
         WHERE c_custkey = o_custkey
         AND   l_orderkey = o_orderkey
         AND   o_orderdate >= '1993-10-01'
         AND   o_orderdate < DATE '1993-10-01' + INTERVAL '3' MONTH
         AND   l_returnflag = 'R'
         AND   c_nationkey = n_nationkey
         GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address,
                  c_comment
         ORDER BY revenue DESC;

    The time savings gained by eliminating this step are 15%. This query might have two possible join plans depending on the system configuration. Neither plan requires the last GROUP BY operation. The two join plans are as follows.

  • ((lineitem X ordertbl)' X customer) X nation
  • ((ordertbl X customer) X lineitem)' X nation

    Suppose you submit the following request against the TPC-H performance benchmark database.

         SELECT c_custkey,c_name,SUM(l_extendedprice*(1-l_discount)(FLOAT))
                (DECIMAL(18,2)) AS revenue, c_acctbal, n_name, c_address,
                c_phone,c_comment
         FROM customer, ordertbl, lineitem, nation
         WHERE c_custkey = o_custkey
         AND   l_orderkey = o_orderkey
         AND   o_orderdate >= '1993-10-01'
         AND   o_orderdate < DATE '1993-10-01' + INTERVAL '3' MONTH
         AND   l_returnflag = 'R'
         AND   c_nationkey = n_nationkey
         GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address,
                  c_comment
         ORDER BY revenue DESC;

    When the definition of the nation table is changed to define the primary index on n_nationkey rather than n_name, the join plan, without the final GROUP BY elimination, would include the last GROUP BY as follows: (((lineitem X ordertbl)' X customer) X nation)'. Note the ' (GROUP BY) operator at the end of the join plan specification.

    The Partial GROUP BY optimization compensates for this change in the primary index definition by removing the last GROUP BY, resulting in the plan ((lineitem X ordertbl)' X customer) X nation.
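
    The primary index in question is the one declared in the DDL for the nation table. The following sketch shows the relevant clause, with column types taken from the TPC-H schema:

         CREATE TABLE nation
           (n_nationkey INTEGER NOT NULL,
            n_name      CHAR(25),
            n_regionkey INTEGER,
            n_comment   VARCHAR(152))
         PRIMARY INDEX (n_nationkey);  /* versus PRIMARY INDEX (n_name) */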

    To demonstrate the advantage of Partial GROUP BY, consider the following experiment using one of the queries from the TPC-H performance benchmark that was run against the lineitem table from the standard database schema for that benchmark (see http://www.tpc.org/tpch/spec/tpch2.6.0.pdf). The variables in the experiment were the number of duplicate grouping column values and whether Partial GROUP BY was enabled or not. The results of this experiment are as follows:

     

    Number of         Execution Time With    Execution Time With     Percent Difference in
    Grouping Column   Partial GROUP BY On    Partial GROUP BY Off    Execution Time With PGB
    Duplicates        (seconds)              (seconds)               Enabled or Disabled
    ---------------   --------------------   --------------------    -----------------------
           5                  10                     10                        0.00
          10                  10                     13                       26.09
          20                  10                     15                       40.00
          25                  10                     17                       51.85
          80                   6                     35                      141.46
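
    The percent difference column is consistent with the difference between the two execution times expressed as a percentage of their mean, where t_on and t_off are the On and Off times:

         percent difference = 100 * |t_off - t_on| / ((t_on + t_off) / 2)

    For example, the last row gives 100 * (35 - 6) / ((6 + 35) / 2) = 141.46.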

    These measurements show that as the number of grouping column duplicates increases, the Partial GROUP BY optimization eliminates significantly more rows than when it is not enabled, resulting in a significant reduction in execution time.

    For the Optimizer to know when to apply GROUP BY operators effectively, you should collect statistics on all join and GROUP BY columns in your typical requests.

    For the following query, for example, you should collect statistics on l_orderkey, o_orderkey, o_orderdate, c_custkey, c_nationkey, and n_nationkey.

         SELECT c_custkey, c_name, 
                SUM(l_extendedprice*(1-l_discount)(FLOAT))(DECIMAL(18,2)) 
                  AS revenue, c_acctbal, n_name, c_address,c_phone,c_comment
         FROM customer, ordertbl, lineitem, nation
         WHERE c_custkey=o_custkey
         AND   l_orderkey=o_orderkey
         AND   o_orderdate>='1993-10-01'
         AND   o_orderdate<DATE '1993-10-01' + INTERVAL '3' MONTH
         AND   l_returnflag='R'
         AND   c_nationkey=n_nationkey
         GROUP BY c_custkey, c_name, c_acctbal, c_phone, n_name, c_address,
                  c_comment
         ORDER BY revenue DESC;

    As another example, you should collect statistics on l_partkey and ps_partkey for the following query.

         SELECT l_partkey, SUM(l_quantity), SUM(ps_supplycost)
         FROM partsupp, lineitem
         WHERE l_partkey=ps_partkey 
         GROUP BY 1;
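
    For example, one form of the COLLECT STATISTICS statement that gathers these statistics is sketched below; the available syntax options vary by release:

         COLLECT STATISTICS COLUMN (l_partkey)  ON lineitem;
         COLLECT STATISTICS COLUMN (ps_partkey) ON partsupp;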