How Teradata Database Resolves Multiple Grouping Sets Specifications

How Teradata Database Resolves Multiple Grouping Sets Specifications - Teradata Database

SQL Data Manipulation Language

Product

Teradata Database

Release Number

15.10

Language

English (United States)

Last Update

2018-10-06

dita:id

B035-1146

lifecycle

Product Category

Teradata® Database

Teradata Database resolves multiple grouping sets specifications by concatenating pairwise the individual elements of the different sets. For information about grouping sets specifications, see “CUBE Grouping Set Option” on page 136, “GROUPING SETS Option” on page 139, and “ROLLUP Grouping Set Option” on page 142.

This is trivial in the case of a simple grouping specification because it is a set containing only one element. However, when applied to more complicated specifications that contain multiple grouping specification elements, the resolution is more complicated.

For example, the following two GROUP BY clauses are semantically identical:

     GROUP BY GROUPING SETS ((A,B), (C)), GROUPING SETS ((X,Y),())

     GROUP BY GROUPING SETS ((A,B,X,Y),(A,B),(C,X,Y),(C))

Example : Semantically Identical Grouping Sets Specifications and Their Resolution

The following three queries return the same results set because they are semantically identical.

     SELECT y,m,r, SUM(u)

     FROM test

     GROUP BY CUBE(y,m), CUBE(r)

     ORDER BY 1,2,3;

     SELECT y,m,r,SUM(u)

     FROM test

     GROUP BY CUBE(y,m,r)

     ORDER BY 1,2,3;

     SELECT y,m,r,SUM(u)

     FROM test

     GROUP BY GROUPING SETS(y,()), GROUPING SETS(m,()),

              GROUPING SETS(r,())

     ORDER BY 1,2,3;

The following three queries return the same results set because they are semantically identical.

     SELECT y, m, r, s, SUM(u)

     FROM test

     GROUP BY ROLLUP(y,m),ROLLUP(r,s)

     ORDER BY 1,2,3,4;

     SELECT y, m, r, s, SUM(u)

     FROM test

     GROUP BY GROUPING SETS((y, m),(),y),ROLLUP(r,s)

     ORDER BY 1,2,3,4;

     SELECT y, m, r, s, SUM(u)

     FROM test

     GROUP BY GROUPING SETS((y,m),(),y),GROUPING SETS((),r,(r,s))

     ORDER BY 1,2,3,4;

These examples show simple uses of the GROUP BY clause.

Examples for the ROLLUP, CUBE, and GROUPING SETS extended GROUP BY operators are provided in “ROLLUP Grouping Set Option” on page 142, “CUBE Grouping Set Option” on page 136, and “GROUPING SETS Option” on page 139, respectively.

Example : Simple GROUP BY Operation

Generate a report of salary totals by department, the result might look something like the report that follows.

     SELECT dept_no, SUM(salary)

     FROM employee

     GROUP BY dept_no;

The result might be similar to the report that follows.

`dept_no` `-------`	`SUM(salary)` `-----------`
`100`	`180,500.00`
`300`	`143,000.00`
`500`	`268,000.00`
`600`	`146,600.00`
`700`	`113,000.00`

Example : Specifying a GROUP BY Clause on Nonaggregate Expressions When the Select List Includes an Aggregate

If you specify an aggregate in the select expression list of a query, then you must also specify a GROUP BY clause that includes all nonaggregate expressions from the select list. Otherwise, the system returns the following message.

     Selected non‑aggregate values must be part of the associated group.

The system returns error message 3504 whenever an aggregate query includes a nonaggregate expression in its SELECT list, WHERE clause, ORDER BY clause, or HAVING clause, but not in a GROUP BY clause.

The system also returns this error when ORDER BY and WITH clauses specify aggregates, but the select list for the query does not.

For example, Teradata Database aborts the following query because it does not specify a GROUP BY clause that groups on the only nonaggregate expression in the select list, which is department_number:

     SELECT department_number, SUM(salary_amount)

     FROM employee

     WHERE department_number IN (100, 200, 300);

To work as intended, the query must be rewritten with an appropriate GROUP BY clause:

     SELECT department_number, SUM(salary_amount)

     FROM employee

     WHERE department_number IN (100, 200, 300)

     GROUP BY department_number;

The following statement aborts and returns an error because it does not specify all of the nonaggregate columns from the select list in its GROUP BY clause.

     SELECT employee.dept_no, department.dept_name, AVG(salary)

     FROM employee, department

     WHERE employee.dept_no = department.dept_no

     GROUP BY employee.dept_no;

In this case, the qualified nonaggregate column department.dept_name is missing from the GROUP BY clause.

The following statement aborts and returns an error because the nonaggregate grouping column specified in the ORDER BY clause is not also specified in the GROUP BY clause.

     SELECT employee.dept_no, AVG(salary)

     FROM employee, department

     WHERE employee.dept_no = department.dept_no

     ORDER BY department.dept_name

     GROUP BY employee.dept_no;

The following statement, based on the table definitions shown, aborts and returns an error because the aggregate query includes a nonaggregate expression, d1, in its HAVING clause, but not in the GROUP BY clause.

     CREATE TABLE t1(a1 int, b1 int, c1 int, d1 int);

     CREATE TABLE t2(a2 int, b2 int, c2 int, d2 int);

     SELECT min(a1) as i, max(b1) as j from t1

     GROUP BY c1

     HAVING 30 >= (sel count(*) from t2 where t1.d1=5);

The correct form of the query includes the nonaggregate expression, d1, in its HAVING clause and in the GROUP BY clause.

     SELECT min(a1) as i, max(b1) as j from t1

     GROUP BY c1, d1

     HAVING 30 >= (sel count(*) from t2 where t1.d1=5);

Example : Specifying GROUP BY Using an Ordered Analytic Function

The following statement specifies a GROUP BY clause with an ordered analytical function to generate report breaks where the function resets and computes a new value for the next grouping.

The example groups all items into percentile by profitability for each store and then returns only the items of interest, which, in this case, are the lowest percentile for each store.

     SELECT store, item, profit, QUANTILE(100, profit) AS percentile

     FROM (SELECT items.item, SUM(sales) -

                  (COUNT(sales)*items.item_cost) AS profit

           FROM daily_sales, items

           WHERE daily_sales.item = items.item

           GROUP BY items.item,items.itemcost) AS item_profit

     GROUP BY store, item, profit, percentile

     QUALIFY percentile = 99;

The result of this query looks like the following table:

store -----	item ----	profit ------	percentile ----------
Eastside	Golf balls	100.19	99
Westside	Tennis balls	-110.00	99
Central	Bowling balls	-986.81	99
South	Codfish balls	- 1,891.89	99
North	Bocce balls	1,180.88	99

Example : SELECT Statement With a Scalar Subquery in Its GROUP BY Clause

The following example specifies a scalar subquery in its GROUP BY clause:

     SELECT sale_date, SUM(amount)

     FROM sales_table AS s

     GROUP BY sale_date, (SELECT prod_name

                          FROM prod_table AS p

                          WHERE p.prod_no = s.prod_no);

Example : GROUP BY and PERIOD Value Expressions

The following example shows how the GROUP BY clause operates on PERIOD value expressions, where period_of_stay is the PERIOD value expression.

     SELECT emp_no, period_of_stay

     FROM employee

     GROUP BY emp_no, period_of_stay;

For more information related to the GROUP BY clause, see:

“How Teradata Database Resolves Multiple Grouping Sets Specifications” on page 131

“CUBE Grouping Set Option” on page 136

“GROUPING SETS Option” on page 139

“ROLLUP Grouping Set Option” on page 142

“ORDER BY Clause” on page 202