Usage Notes - Teradata Database

GROUP BY Clause Terminology

A GROUP BY clause is said to be simple if it does not contain any of the following elements:

rollup_list
cube_list
grouping_sets_specification

A GROUP BY clause is said to be primitive if it does not contain any of the following elements:

rollup_list
cube_list
grouping_sets_specification
a grouping_expressionand an empty_grouping_set.

See “GROUPING SETS Option” for definitions of rollup_list, cube_list, grouping_sets_specification, and empty_grouping_set.

See the following topics for examples of non-simple, non-primitive GROUP BY clauses:

Ordinary Grouping Set Expressions

The following table provides the definitions for the valid ordinary grouping set expressions you can specify in a GROUP BY clause.

Ordinary Grouping Set Expression	Definition
column_name	Set of column names drawn from the list of tables specified in the FROM clause of the SELECT statement that is used in the GROUP BY clause to specify the columns by which data is to be grouped. The maximum number of columns you can specify is 64. You cannot include LOB columns in the grouping expression. You can specify a column_name_alias instead of column_name as long as it does not have the same name as a physical column in the table definition. In this case, you must specify column_position, not column_name_alias.
column_position	Sequential numeric position of columns within the column_list clause of the SELECT statement that is used in the GROUP BY clause to specify the order by which data is to be grouped. The value you specify must be a positive constant integer literal with a value between 1 and the number of columns specified in the select list, inclusive. Note that Teradata Database treats macro and procedure parameters as expressions, not as the specification of a column position. You cannot include LOB columns in the grouping expression. This is a Teradata extension to the ANSI SQL:2011 standard.
column_expression	List of valid SQL expressions specified for the GROUP BY clause. You can specify column_name, column_position, and expression either as single entries or as a list. You can specify a scalar subquery as an ordinary grouping set expression. You can also specify a scalar UDF as an ordinary grouping set expression. You cannot include LOB columns in the ordinary grouping set. Use of column_expression is a Teradata extension to the ANSI SQL:2011 standard.

Ordinary Grouping Set Expression

Definition

column_name

Set of column names drawn from the list of tables specified in the FROM clause of the SELECT statement that is used in the GROUP BY clause to specify the columns by which data is to be grouped.

The maximum number of columns you can specify is 64.

You cannot include LOB columns in the grouping expression.

You can specify a column_name_alias instead of column_name as long as it does not have the same name as a physical column in the table definition. In this case, you must specify column_position, not column_name_alias.

column_position

Sequential numeric position of columns within the column_list clause of the SELECT statement that is used in the GROUP BY clause to specify the order by which data is to be grouped.

The value you specify must be a positive constant integer literal with a value between 1 and the number of columns specified in the select list, inclusive. Note that Teradata Database treats macro and procedure parameters as expressions, not as the specification of a column position.

You cannot include LOB columns in the grouping expression.

This is a Teradata extension to the ANSI SQL:2011 standard.

column_expression

List of valid SQL expressions specified for the GROUP BY clause.

You can specify column_name, column_position, and expression either as single entries or as a list.

You can specify a scalar subquery as an ordinary grouping set expression.

You can also specify a scalar UDF as an ordinary grouping set expression.

You cannot include LOB columns in the ordinary grouping set.

Use of column_expression is a Teradata extension to the ANSI SQL:2011 standard.

GROUP BY and Aggregate Operations

For aggregate operations, including SUM, AVERAGE, MAX, MIN, and COUNT, GROUP BY can be used to return a summary row for each group.

Aggregate operators can be used only in the SELECT expression list or in the optional HAVING clause.

GROUP BY and Nonaggregate Operations

Nonaggregated variables in SELECT, ORDER BY, and HAVING need to appear in the group by list.

All nonaggregate groups in a SELECT expression list or HAVING expression list must be included in the GROUP BY clause.

GROUP BY and DISTINCT

For cases when the GROUP BY is semantically equivalent to DISTINCT, the optimizer makes a cost-based decision to eliminate duplicates either by way of a combined sort and duplicate elimination or an aggregation step.

Comparing GROUP BY and Correlated Subqueries Using a Scalar UDF

You can specify a correlated subquery using a scalar UDF in the same way that you would specify a column name or parameterized value in the GROUP BY clause.

GROUP BY and Built-in Ordered Analytic Functions

For built-in ordered analytic functions specific to Teradata Database, such as CSUM and MAVG, GROUP BY determines the partitions over which the function executes. For an example, see Example: Specifying GROUP BY Using an Ordered Analytic Function.

For window functions, such as SUM and AVG, GROUP BY collapses all rows with the same value for the group-by columns into a single row. The GROUP BY clause must include all the columns specified in the:

Select list of the SELECT clause
Window functions in the select list of a SELECT clause
Window functions in the search condition of a QUALIFY clause
The condition in a RESET WHEN clause

For examples and more information on GROUP BY and ordered analytical functions, see SQL Functions, Operators, Expressions, and Predicates .

BLOB or CLOB columns Not Allowed in a GROUP BY Expression

You cannot specify BLOB or CLOB columns in a grouping expression, rollup list, cube list, or grouping sets specification.

GROUP BY and Recursive Queries

You cannot specify a GROUP BY clause in a recursive statement of a recursive query. However, you can specify a GROUP BY clause in a nonrecursive seed statement in a recursive query.

ORDER BY and GROUP BY

If you specify an ORDER BY clause, then any group contained in the ORDER BY clause must also be included in the GROUP BY clause.

WHERE, GROUP BY, and HAVING Clause Evaluation

WHERE, GROUP BY, and HAVING clauses in a SELECT statement are evaluated in this order:

WHERE
GROUP BY
HAVING

Reason for Unexpected Row Length Errors: Sorting Rows for Grouping

Before performing the sort operation that groups the rows to be returned to the requestor, Teradata Database creates a sort key and appends it to the rows to be sorted. If the length of this temporary data structure exceeds the system row length limit of 64 KB, the operation returns an error to the requestor. Depending on the situation, the message text is one of the following:

A data row is too long.
Maximum row length exceeded in database_object_name.

For explanations of these messages, see Messages .

The BYNET only looks at the first 4096 bytes of the sort key created to sort the specified fields, so if the field the sort key is based on is greater than 4096 bytes, the key is truncated and the data might not be returned in the desired order.

How Teradata Database Resolves Multiple Grouping Sets Specifications

Teradata Database resolves multiple grouping sets specifications by concatenating pairwise the individual elements of the different sets. For information about grouping sets specifications, see “CUBE Grouping Set Option”, “GROUPING SETS Option”, and “ROLLUP Grouping Set Option”.

This is trivial in the case of a simple grouping specification because it is a set containing only one element. However, when applied to more complicated specifications that contain multiple grouping specification elements, the resolution is more complicated.

For example, the following two GROUP BY clauses are semantically identical:

     GROUP BY GROUPING SETS ((A,B), (C)), GROUPING SETS ((X,Y),())
     GROUP BY GROUPING SETS ((A,B,X,Y),(A,B),(C,X,Y),(C))

Example: Semantically Identical Grouping Sets Specifications and Their Resolution

The following three queries return the same results set because they are semantically identical.

     SELECT y,m,r, SUM(u)
     FROM test
     GROUP BY CUBE(y,m), CUBE(r)
     ORDER BY 1,2,3;

     SELECT y,m,r,SUM(u)
     FROM test
     GROUP BY CUBE(y,m,r)
     ORDER BY 1,2,3;

     SELECT y,m,r,SUM(u)
     FROM test
     GROUP BY GROUPING SETS(y,()), GROUPING SETS(m,()),
              GROUPING SETS(r,())
     ORDER BY 1,2,3;

The following three queries return the same results set because they are semantically identical.

     SELECT y, m, r, s, SUM(u)
     FROM test
     GROUP BY ROLLUP(y,m),ROLLUP(r,s)
     ORDER BY 1,2,3,4;

     SELECT y, m, r, s, SUM(u)
     FROM test
     GROUP BY GROUPING SETS((y, m),(),y),ROLLUP(r,s)
     ORDER BY 1,2,3,4;

     SELECT y, m, r, s, SUM(u)
     FROM test
     GROUP BY GROUPING SETS((y,m),(),y),GROUPING SETS((),r,(r,s))
     ORDER BY 1,2,3,4;