CUBE Grouping Set Option - Teradata Database

SQL Data Manipulation Language

Product

Teradata Database

Release Number

16.10

Published

June 2017

Language

English (United States)

Last Update

2018-04-25

dita:mapPath

psg1480972718197.ditamap

dita:ditavalPath

changebar_rev_16_10_exclude_audience_ie.ditaval

dita:id

B035-1146

lifecycle

Product Category

Teradata® Database

Purpose

Analyzes data by grouping it into multiple dimensions.

Syntax

Syntax Elements

ordinary_grouping_set: one or more expressions used to group cubed report rows across multiple dimensions.; All ordinary grouping sets must make explicit column references. You cannot specify column positions with CUBE.; You cannot specify more than 8 columns in the ordinary grouping sets.; This limit is actually on the number of table dimensions that can be cubed. The value is much smaller than the limits for the ROLLUP and GROUPING SETS options because it specifies a limit of 28, or 256 different combinations of columns, not just columns.; You cannot reference columns that have a LOB data type unless they are first CAST to another type or passed to a function whose result is not a LOB. For example, casting a BLOB to BYTE or VARBYTE or a CLOB to CHARACTER or VARCHAR.

ANSI Compliance

CUBE is ANSI SQL:2011 compliant.

Rules and Restrictions for CUBE

You cannot specify a SAMPLE clause in a query that specifies the CUBE option in a GROUP BY clause, with the exception of a query where the CUBE option in the GROUP BY clause appears in a derived table or view and the SAMPLE clause appears in the outer query.

How CUBE Summarizes Data

Given n dimensions, CUBE generates 2n different kinds of groups. Teradata Database reports each group as a single row.

For example, with 2 dimensions, CUBE generates 4 groups as follows:

group number	dimension 1	dimension 2
1	X	X
2	X
3		X
4

With 3 dimensions, CUBE generates 8 groups as follows:

group number	dimension 1	dimension 2	dimension 3
1	X	X	X
2	X	X
3	X		X
4	X
5		X	X
6		X
7			X
8

Example: CUBE Grouping

Suppose you have the following sales_view table data:

sales_view
PID	cost	sale	margin	state	county	city
1	38350	50150	11800	CA	Los Angeles	Long Beach
1	63375	82875	19500	CA	San Diego	San Diego
1	46800	61200	14400	CA	Los Angeles	Avalon
1	40625	53125	12500	CA	Los Angeles	Long Beach
2	20000	24750	4750	CA	Los Angeles	Long Beach
2	4800	5940	1140	CA	San Diego	Santee
1	57600	71280	13680	CA	San Diego	San Diego

You are interested in determining the effect of county and PID on sale price.

The following SELECT statement analyzes the effect of county and PID on sale price using CUBE in its GROUP BY clause:

     SELECT pid, county, SUM(sale)
     FROM sales_view
     GROUP BY CUBE (pid,county);

The query reports the following data, where the QUESTION MARK character indicates a null:

PID	County	Sum(sale)
---	------	---------
2	Los Angeles	24750
1	Los Angeles	164475
2	San Diego	5940
1	San Diego	154155
2	?	30690
1	?	381630
?	Los Angeles	189225
?	San Diego	160095
?	?	349320

This query computes four different levels of grouping over the dimensions of county and PID by:

county and PID (groups 1 through 4).
PID only (groups 5 and 6).
county only (groups 7 and 8).
Nothing (group 9), which aggregates sales over all counties and PIDs.

Notice that nulls are used to represent the empty set, not missing information. The nulls mean that information is not reported at the grouping level represented, not that the information is missing from the sales_view base table.