15.00 - The Window Specification - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

The Window Specification

Purpose  

Cumulative, group, moving, or remaining computation of an aggregate function.

Syntax  

where:

 

Syntax element …

Specifies …

AVG
CORR
COUNT
COVAR_POP
COVAR_SAMP
MAX
MIN
REGR_AVGX
REGR_AVGY
REGR_COUNT
REGR_INTERCEPT
REGR_R2
REGR_SLOPE
REGR_SXX
REGR_SXY
REGR_SYY
STDDEV_POP
STDDEV_SAMP
SUM
VAR_POP
VAR_SAMP

the aggregate function and arguments on which the window specification is applied.

For descriptions of aggregate functions and arguments, see Chapter 2: “Aggregate Functions.”

OVER

how values are grouped, ordered, and considered when computing the cumulative, group, or moving function.

Values are grouped according to the PARTITION BY and RESET WHEN clauses, sorted according to the ORDER BY clause, and considered according to the aggregation group within the partition.

PARTITION BY

in its column_reference, or comma-separated list of column references, the group, or groups, over which the function operates.

PARTITION BY is optional. If there are no PARTITION BY or RESET WHEN clauses, then the entire result set, delivered by the FROM clause, constitutes a single group, or partition.

PARTITION BY clause is also called the window partition clause.

ORDER BY

in its value_expression the order in which the values in a group, or partition, are sorted.

ASC

that the results are to be ordered in ascending sort order.

If the sort field is a character string, the system orders it in ascending order according to the definition of the collation sequence for the current session.

The default order is ASC.

DESC

that the results are to be ordered in descending sort order.

If the sort field is a character string, the system orders it in descending order according to the definition of the collation sequence for the current session.

NULLS FIRST

that NULL results are to be listed first.

NULLS LAST

that NULL results are to be listed last.

RESET WHEN

the group or partition, over which the function operates, depending on the evaluation of the specified condition. If the condition evaluates to TRUE, a new dynamic partition is created inside the specified window partition.

RESET WHEN is optional. If there are no RESET WHEN or PARTITION BY clauses, then the entire result set, delivered by the FROM clause, constitutes a single partition.

If RESET WHEN is specified, then the ORDER BY clause must be specified also.

condition

a conditional expression used to determine conditional partitioning. The condition in the RESET WHEN clause is equivalent in scope to the condition in a QUALIFY clause with the additional constraint that nested ordered analytical functions cannot specify a RESET WHEN clause. In addition, you cannot specify SELECT as a nested subquery within the condition.

The condition is applied to the rows in all designated window partitions to create sub-partitions within the particular window partitions.

For more information, see “RESET WHEN Condition Rules” on page 980 and the “QUALIFY Clause” in SQL Data Manipulation Language.

ROWS

the starting point for the aggregation group within the partition. The aggregation group end is the current row.

The aggregation group of a row R is a set of rows, defined relative to R in the ordering of the rows within the partition.

If there are no ROWS or ROWS BETWEEN clause, the default aggregation group is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

The default when there is no ROWS clause for FIRST_VALUE/LAST_VALUE is different. See “FIRST_VALUE / LAST_VALUE” on page 1018.

ROWS BETWEEN

the aggregation group start and end, which defines a set of rows relative to the current row in the ordering of the rows within the partition.

The row specified by the group start must precede the row specified by the group end.

If there are no ROWS or ROWS BETWEEN clause, the default aggregation group is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

UNBOUNDED PRECEDING

the entire partition preceding the current row.

UNBOUNDED FOLLOWING

the entire partition following the current row.

CURRENT ROW

the start or end of the aggregation group as the current row.

value PRECEDING

the number of rows preceding the current row.

The value for value is always a positive integer literal.

The maximum number of rows in an aggregation group is 4096 when value PRECEDING appears as the group start or group end.

value FOLLOWING

the number of rows following the current row.

The value for value is always a positive integer literal.

The maximum number of rows in an aggregation group is 4096 when value FOLLOWING appears as the group start or group end.

ANSI Compliance

Window aggregate functions are partially ANSI SQL:2011 compliant.

In the presence of an ORDER BY clause and the absence of a ROWS or ROWS BETWEEN clause, ANSI SQL:2011 window aggregate functions use ROWS UNBOUNDED PRECEDING as the default aggregation group, whereas Teradata SQL window aggregate functions use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

The RESET WHEN clause is a Teradata extension to the ANSI SQL standard.

Type of Computation

Arguments to Window Aggregate Functions

To compute this type of function …

Use this aggregation group …

Cumulative

  • ROWS UNBOUNDED PRECEDING
  • ROWS BETWEEN UNBOUNDED PRECEDING AND value PRECEDING
  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  • ROWS BETWEEN UNBOUNDED PRECEDING AND value FOLLOWING
  • Group

    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

    Moving

  • ROWS value PRECEDING
  • ROWS CURRENT ROW
  • ROWS BETWEEN value PRECEDING AND value PRECEDING
  • ROWS BETWEEN value PRECEDING AND CURRENT ROW
  • ROWS BETWEEN value PRECEDING AND value FOLLOWING
  • ROWS BETWEEN CURRENT ROW AND CURRENT ROW
  • ROWS BETWEEN CURRENT ROW AND value FOLLOWING
  • ROWS BETWEEN value FOLLOWING AND value FOLLOWING
  • Remaining

  • ROWS BETWEEN value PRECEDING AND UNBOUNDED FOLLOWING
  • ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  • ROWS BETWEEN value FOLLOWING AND UNBOUNDED FOLLOWING
  • Window aggregate functions can take literals, literal expressions, column names (sales, for example), or column expressions (sales + profit) as arguments.

    Window aggregates can also take regular aggregates as input parameters to the PARTITION BY and ORDER BY clauses. The RESET WHEN clause can take an aggregate as part of the RESET WHEN condition clause.

    COUNT can take “*” as an input argument, as in the following SQL query:

       SELECT city, kind, sales, profit,
       COUNT(*) OVER (PARTITION BY city, kind 
                      ROWS BETWEEN UNBOUNDED PRECEDING AND 
                      UNBOUNDED FOLLOWING)
       FROM activity_month;

    Result Type and Format

    The result data type and format for window aggregate functions are as follows.

     

    Function

    Result Type

    Format

    AVG(x)

    where x is a character type

    FLOAT

    Default format for FLOAT

    AVG(x)

    where x is a numeric, DATE, or INTERVAL type

    FLOAT

    Same format as operand x

    CORR(x,y)
    COVAR_POP(x,y)
    COVAR_SAMP(x,y)
    REGR_AVGX(y,x)
    REGR_AVGY(y,x)
    REGR_INTERCEPT(x,y)
    REGR_R2(x,y)
    REGR_SLOPE(x,y)
    REGR_SXX(x,y)
    REGR_SXY(x,y)
    REGR_SYY(x,y)
    STDDEV_POP(x)
    STDDEV_SAMP(x)
    VAR_POP(x)
    VAR_SAMP(x)

    where x is a character type

    FLOAT

    Default format for FLOAT

    CORR(x,y)
    COVAR_POP(x,y)
    COVAR_SAMP(x,y)
    REGR_AVGX(y,x)
    REGR_AVGY(y,x)
    REGR_INTERCEPT(x,y)
    REGR_R2(x,y)
    REGR_SLOPE(x,y)
    REGR_SXX(x,y)
    REGR_SXY(x,y)
    REGR_SYY(x,y)
    STDDEV_POP(x)
    STDDEV_SAMP(x)
    VAR_POP(x)
    VAR_SAMP(x)

    where x is one of the following types:

  • Numeric
  • DATE
  • Interval
  • Same data type as operand x.

    Default format for the data type of operand x

    REGR_AVGX(y,x)
    REGR_AVGY(y, x)

    where x is a UDT

     

     

     

    Default format for the data type to which the UDT is implicitly cast.

    COUNT(x)
    COUNT(*)
    REGR_COUNT(x,y)

    where the transaction mode is ANSI

    If MaxDecimal in DBSControl is…

  • 0 or 15, then the result type is DECIMAL(15,0) and the format is
    -(15)9.
  • 18, then the result type is DECIMAL(18,0) and the format is
    -(18)9.
  • 38, then the result type is DECIMAL(38,0) and the format is
    -(38)9.
  • ANSI transaction mode uses DECIMAL because tables frequently have a cardinality exceeding the range of INTEGER.

    COUNT(x)
    COUNT(*)
    REGR_COUNT(x,y)

    where the transaction mode is Teradata

    INTEGER

    Teradata transaction mode uses INTEGER to avoid regression problems.

    Note: You can cast the final result of a COUNT window aggregate function; however, the cast is not used as part of the window function computation as it is for the COUNT aggregate function and, therefore, cannot be used to avoid numeric overflow errors that might occur during the computation.

    Default format for INTEGER

    MAX(x), MIN(x)

    Same data type as operand x.

    Same format as operand x

    SUM(x)

    where x is a character type

    Same as operand x.

    Default format for FLOAT

    SUM(x)

    where x is a DECIMAL(n,m) type

    DECIMAL(p,m), where p is determined according to the following rules:

    If MaxDecimal in DBSControl is 0 or 15 and

  • n  15, then p = 15.
  • 15 < n  18, p = 18.
  • n > 18, then p = 38.
  • If MaxDecimal in DBSControl is 18 and

  • n  18, then p = 18.
  • n > 18, then p = 38.
  • If MaxDecimal in DBSControl is 38 and n = any value, the p = 38.

    Default format for DECIMAL

     

     

     

     

     

     

     

    SUM(x)

    where x is any numeric type other than DECIMAL

    Same as operand x.

    Default format for the data type of the operand

    For information on the default format of data types and an explanation of the formatting characters in the format, see “Data Type Formats and Format Phrases” in SQL Data Types and Literals.

    Result Title

    The default title that appears in the heading for displayed or printed results depends on the type of computation performed.

     

    IF the type of computation is …

    THEN the result title is …

    cumulative

    Cumulative Function_name (argument_list)

    For example, consider the following computation:

       SELECT AVG(sales) OVER (PARTITION BY region
          ORDER BY smonth ROWS UNBOUNDED PRECEDING)
       FROM sales_history;

    The title that appears in the result heading is:

       Cumulative Avg(sales)
       ---------------------

    group

    Group Function_name (argument_list)

    For example, consider the following computation:

       SELECT AVG(sales) OVER (PARTITION BY region
          ORDER BY smonth ROWS BETWEEN UNBOUNDED
          PRECEDING AND UNBOUNDED FOLLOWING)
       FROM sales_history;

    The title that appears in the result heading is:

       Group Avg(sales)
       ----------------

    moving

    Moving Function_name (argument_list)

    For example, consider the following computation:

       SELECT AVG(sales) OVER (PARTITION BY region
          ORDER BY smonth ROWS 2 PRECEDING)
       FROM sales_history;

    The title that appears in the result heading is:

       Moving Avg(sales)
       -----------------

    remaining

    Remaining Function_name (argument_list)

    For example, consider the following computation:

       SELECT AVG(sales) OVER (PARTITION BY region
          ORDER BY smonth ROWS BETWEEN CURRENT ROW
          AND UNBOUNDED FOLLOWING)
       FROM sales_history;

    The title that appears in the result heading is:

       Remaining Avg(sales)
       --------------------

    Problems With Missing Data

    Ensure that data you analyze has no missing data points. Computing a moving function over data with missing points produces unexpected and incorrect results because the computation considers n physical rows of data rather than n logical data points.

    Nesting Aggregates in Window Functions

    Although you can nest aggregates in window functions, including the select list, HAVING, QUALIFY, and ORDER BY clauses, the HAVING clause can only contain aggregate function references because HAVING cannot contain nested syntax like RANK() OVER (ORDER BY SUM(x)).

    Aggregate functions cannot be specified with Teradata-specific functions.

    Example  

    The following query nests the SUM aggregate function within the RANK ordered analytical function in the select list:

       SELECT state, city, SUM(sale), 
       RANK() OVER (PARTITION BY state ORDER BY SUM(sale))
       FROM T1
       WHERE T1.cityID = T2.cityID
       GROUP BY state, city
       HAVING MAX(sale) > 10;

    Alternative: Using Derived Tables

    Although only window functions allow aggregates specified together in the same SELECT list, window functions and Teradata-specific functions can be combined with aggregates using derived tables or views. Using derived tables or views also clarifies the semantics of the computation.

    Example  

    The following example shows the sales rank of a particular product in a store and its percent contribution to the store sales for the top three products in each store.

       SELECT RT.storeid, RT.prodid, RT.sales,
       RT.rank_sales, RT.sales * 100.0/ST.sum_store_sales
       FROM (SELECT storeid, prodid, sales, RANK(sales) AS rank_sales
       FROM sales_tbl
       GROUP BY storeID
       QUALIFY RANK(sales) <=3) AS RT,
       (SELECT storeID, SUM(sales) AS sum_store_sales
       FROM sales_tbl
       GROUP BY storeID) AS ST
       WHERE RT.storeID = ST.storeID
       ORDER BY RT.storeID, RT.sales;

    The results table might look something like the following:

     

    storeID

    prodID

    sales

    rank_sales

    sales*100.0/sum_store_sales

    1001

    D

    35000.00

    3

    17.949

    1001

    C

    60000.00

    2

    30.769

    1001

    A

    100000.00

    1

    51.282

    1002

    D

    25000.00

    3

    25.000

    1002

    C

    35000.00

    2

    35.000

    1002

    A

    40000.00

    1

    40.000

    1003

    C

    20000.00

    3

    20.000

    1003

    A

    30000.00

    2

    30.000

    1003

    D

    50000.00

    1

    50.000

    ...

    ...

    ...

    ...

     

    Teradata-Specific Alternatives to Ordered Analytical Functions

    Teradata SQL supports two syntax alternatives for ordered analytical functions:

  • Teradata-specific
  • ANSI SQL:2011 compliant
  • Window aggregate, rank, distribution, and row number functions are ANSI SQL:2011 compliant. Teradata-specific functions are not.

    Teradata-Specific Functions and ANSI SQL:2011 Window Functions

    The following table identifies equivalent ANSI SQL:2011 window functions for Teradata-specific functions:

    Note: The use of the Teradata-specific functions listed in the following table is strongly discouraged. These functions are retained only for backward compatibility with existing applications. Be sure to use the ANSI-compliant window functions for any new applications you develop.

     

    Teradata-Specific Functions

    Equivalent ANSI SQL:2011 Window Functions

    CSUM

    SUM

    MAVG

    AVG

    MDIFF(x, w, y)

    composable from SUM

    MLINREG

    composable from SUM and COUNT

    QUANTILE

    composable from RANK and COUNT

    RANK

    RANK

    MSUM

    SUM

    Comparing Window Aggregate Functions and Teradata-Specific Functions

    Avoid using Teradata-specific functions such as MAVG, CSUM, and MSUM for applications intended to be ANSI-compliant and portable.

     

    ANSI Function

    Teradata Function

    Relationship

    AVG

    MAVG

    The form of the AVG window function that specifies an aggregation group of ROWS value PRECEDING is the ANSI equivalent of the MAVG Teradata-specific function.

    Note that the ROWS value PRECEDING phrase specifies the number of rows preceding the current row that are used, together with the current row, to compute the moving average. The total number of rows in the aggregation group is value + 1. For the MAVG function, the total number of rows in the aggregation group is the value of width.

    For AVG window function, an aggregation group of ROWS 5 PRECEDING, for example, means that the 5 rows preceding the current row, plus the current row, are used to compute the moving average. Thus the moving average for the 6th row of a partition would have considered row 6, plus rows 5, 4, 3, 2, and 1 (that is, 6 rows in all).

    For the MAVG function, a width of 5 means that the current row, plus 4 preceding rows, are used to compute the moving average. The moving average for the 6th row would have considered row 6, plus rows 4, 5, 3, and 2 (that is, 5 rows in all).

    SUM

    CSUM
    MSUM

    Be sure to use the ANSI-compliant SUM window function for any new applications you develop. Avoid using CSUM and MSUM for applications intended to be ANSI-compliant and portable.

    The following defines the relationship between the SUM window function and the CSUM and MSUM Teradata-specific functions, respectively:

  • The SUM window function that uses the ORDER BY clause and specifies ROWS UNBOUNDED PRECEDING is the ANSI equivalent of CSUM.
  • The SUM window function that uses the ORDER BY clause and specifies ROWS value PRECEDING is the ANSI equivalent of MSUM.
  • Note that the ROWS value PRECEDING phrase specifies the number of rows preceding the current row that are used, together with the current row, to compute the moving average. The total number of rows in the aggregation group is value + 1. For the MSUM function, the total number of rows in the aggregation group is the value of width.

    Thus for the SUM window function that computes a moving sum, an aggregation group of ROWS 5 PRECEDING means that the 5 rows preceding the current row, plus the current row, are used to compute the moving sum. The moving sum for the 6th row of a partition, for example, would have considered row 6, plus rows 5, 4, 3, 2, and 1 (that is, 6 rows in all).

    For the MSUM function, a width of 5 means that the current row, plus 4 preceding rows, are used to compute the moving sum. The moving sum for the 6th row, for example, would have considered row 6, plus rows 5, 4, 3, and 2 (that is, 5 rows in all).

    Moreover, for data having fewer than width rows, MSUM computes the sum using all the preceding rows. MSUM returns the current sum rather than nulls when the number of rows in the sample is fewer than width.

    Example : Moving Average

    Determine, for a business with several sales territories, the sales in each territory averaged over the current month and the preceding 2 months.

    The following query might return the results found in the table that follows it.

       SELECT territory, smonth, sales,
       AVG(sales) OVER (PARTITION BY territory 
                        ORDER BY smonth 
                        ROWS 2 PRECEDING)
       FROM sales_history;
     
       territory  smonth  sales  Moving Avg(sales)
       ---------  ------  -----  -----------------
       East       199810  10     10
       East       199811  4      7
       East       199812  10     8
       East       199901  7      7
       East       199902  10     9
       West       199810  8      8
       West       199811  12     10
       West       199812  7      9
       West       199901  11     10
       West       199902  6      8

    The meanings of the phrases in the example query are as follows:

     

    Phrase

    Meaning

    PARTITION BY

    Indicates that the rows delivered by the FROM clause, the rows of sales_history, should be assigned to groups, or partitions, based on their territory. If no PARTITION clause is specified, then the entire result set constitutes a single group, or partition.

    ORDER BY

    Indicates that rows are sorted in ascending order of month within each group, or partition. Ascending is the default sort order.

    ROWS 2 PRECEDING

    Defines the number of rows used to compute the moving average. In this case, the computation uses the current row and the 2 preceding rows of the group, or partition, as available.

    Thus, the moving average for the first row of the partition East (199810), which has no preceding rows, is 10. That is, the value of the first row, the current row (10)/ the number of rows (1) = 10.

    The moving average for the second row of the partition East (199811), which has only 1 preceding row, is 7. That is, the value of the second row, the current row, and the preceding row (10 + 4) / the number of rows (2) = 7.

    The moving average for the third row of the partition East (199812), which has 2 preceding rows, is 8. That is, the value of the third row, the current row, and the 2 preceding rows (10 + 4 + 10) / the number of rows (3) = 8. And so on.

    Month is specified as a six-digit numeric in the YYYYMM format.

    Example : Group Count

    The following SQL query might yield the results that follow it, where the group count for sales is returned for each of the four partitions defined by city and kind. Notice that rows that have no sales are not counted.

       SELECT city, kind, sales, profit,
       COUNT(sales) OVER (PARTITION BY city, kind 
                          ROWS BETWEEN UNBOUNDED PRECEDING AND
                          UNBOUNDED FOLLOWING)
       FROM activity_month;
       
       city     kind      sales  profit  Group Count(sales)
       -------  --------  -----  ------  ------------------
       LA       Canvas       45     320                   4
       LA       Canvas      125     190                   4
       LA       Canvas      125     400                   4
       LA       Canvas       20     120                   4
       LA       Leather      20      40                   1
       LA       Leather       ?       ?                   1
       Seattle  Canvas       15      30                   3
       Seattle  Canvas       20      30                   3
       Seattle  Canvas       20     100                   3
       Seattle  Leather      35      50                   1
       Seattle  Leather       ?       ?                   1

    Example : Remaining Count

    To count all the rows, including rows that have no sales, use COUNT(*). Here is an example that counts the number of rows remaining in the partition after the current row:

       SELECT city, kind, sales, profit,
       COUNT(*) OVER (PARTITION BY city, kind ORDER BY profit DESC
                      ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
       FROM activity_month;
       
       city     kind      sales  profit  Remaining Count(*)
       -------  --------  -----  ------  ------------------
       LA       Canvas       20     120                   ?
       LA       Canvas      125     190                   1
       LA       Canvas       45     320                   2
       LA       Canvas      125     400                   3
       LA       Leather       ?       ?                   ?
       LA       Leather      20      40                   1
       Seattle  Canvas       15      30                   ?
       Seattle  Canvas       20      30                   1
       Seattle  Canvas       20     100                   2
       Seattle  Leather       ?       ?                   ?
       Seattle  Leather      35      50                   1

    Note that the sort order that you specify in the window specification defines the sort order of the rows over which the function is applied; it does not define the ordering of the results.

    In the example, the DESC sort order is specified for the computation, but the results are returned in the reverse order.

    To order the results, use the ORDER BY phrase in the SELECT statement:

       SELECT city, kind, sales, profit,
       COUNT(*) OVER (PARTITION BY city, kind ORDER BY profit DESC
                      ROWS BETWEEN 1 FOLLOWING AND 
                      UNBOUNDED FOLLOWING)
       FROM activity_month
       ORDER BY city, kind, profit DESC;
     
       
       city     kind      sales  profit  Remaining Count(*)
       -------  --------  -----  ------  ------------------
       LA       Canvas      125     400                   3
       LA       Canvas       45     320                   2
       LA       Canvas      125     190                   1
       LA       Canvas       20     120                   ?
       LA       Leather      20      40                   1
       LA       Leather       ?       ?                   ?
       Seattle  Canvas       20     100                   2
       Seattle  Canvas       20      30                   1
       Seattle  Canvas       15      30                   ?
       Seattle  Leather      35      50                   1
       Seattle  Leather       ?       ?                   ?

    Example : Cumulative Maximum

    The following SQL query might yield the results that follow it, where the cumulative maximum value for sales is returned for each partition defined by city and kind.

       SELECT city, kind, sales, week, 
       MAX(sales) OVER (PARTITION BY city, kind 
                        ORDER BY week ROWS UNBOUNDED PRECEDING)
       FROM activity_month;
     
       
       city     kind      sales  week  Cumulative Max(sales)
       -------  --------  -----  ----  ---------------------
       LA       Canvas      263    16                    263
       LA       Canvas      294    17                    294
       LA       Canvas      321    18                    321
       LA       Canvas      274    20                    321
       LA       Leather     144    16                    144
       LA       Leather     826    17                    826
       LA       Leather     489    20                    826
       LA       Leather     555    21                    826
       Seattle  Canvas      100    16                    100
       Seattle  Canvas      182    17                    182
       Seattle  Canvas       94    18                    182
       Seattle  Leather     933    16                    933
       Seattle  Leather     840    17                    933
       Seattle  Leather     899    18                    933
       Seattle  Leather     915    19                    933
       Seattle  Leather     462    20                    933
       

    Example : Cumulative Minimum

    The following SQL query might yield the results that follow it, where the cumulative minimum value for sales is returned for each partition defined by city and kind.

       SELECT city, kind, sales, week, 
       MIN(sales) OVER (PARTITION BY city, kind
                        ORDER BY week 
                        ROWS UNBOUNDED PRECEDING) 
       FROM activity_month;
       
       city     kind      sales  week  Cumulative Min(sales)
       -------  --------  -----  ----  ---------------------
       LA       Canvas      263    16                    263
       LA       Canvas      294    17                    263
       LA       Canvas      321    18                    263
       LA       Canvas      274    20                    263
       LA       Leather     144    16                    144
       LA       Leather     826    17                    144
       LA       Leather     489    20                    144
       LA       Leather     555    21                    144
       Seattle  Canvas      100    16                    100
       Seattle  Canvas      182    17                    100
       Seattle  Canvas       94    18                     94
       Seattle  Leather     933    16                    933
       Seattle  Leather     840    17                    840
       Seattle  Leather     899    18                    840
       Seattle  Leather     915    19                    840
       Seattle  Leather     462    20                    462

     

    Example : Cumulative Sum

    The following query returns the cumulative balance per account ordered by transaction date:

       SELECT acct_number, trans_date, trans_amount,
       SUM(trans_amount) OVER (PARTITION BY acct_number 
                               ORDER BY trans_date 
                               ROWS UNBOUNDED PRECEDING) as balance
       FROM ledger
       ORDER BY acct_number, trans_date;

    Here are the possible results of the preceding SELECT:

     

    acct_number

    trans_date

    trans_amount

    balance

    73829

    1998-11-01

    113.45

    113.45

    73829

    1988-11-05

    -52.01

    61.44

    73929

    1998-11-13

    36.25

    97.69

    82930

    1998-11-01

    10.56

    10.56

    82930

    1998-11-21

    32.55

    43.11

    82930

    1998-11-29

    -5.02

    38.09

    Example : Group Sum

    The query below finds the total sum of meat sales for each city.

       SELECT city, kind, sales,
       SUM(sales) OVER (PARTITION BY city ROWS BETWEEN UNBOUNDED PRECEDING
       AND UNBOUNDED FOLLOWING) FROM monthly;

    The possible results of the preceding SELECT appear in the following table:

     

    city

    kind

    sales

    Group Sum (sales)

    Omaha

    pure pork

    45

    220

    Omaha

    pure pork

    125

    220

    Omaha

    pure pork

    25

    220

    Omaha

    variety pack

    25

    220

    Chicago

    variety pack

    55

    175

    Chicago

    variety pack

    45

    175

    Chicago

    pure pork

    50

    175

    Chicago

    variety pack

    25

    175

    Example : Group Sum

    The following query returns the total sum of meat sales for all cities. Note there is no PARTITION BY clause in the SUM function, so all cities are included in the group sum.

       SELECT city, kind, sales,
       SUM(sales) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND
                        UNBOUNDED FOLLOWING) 
       FROM monthly;

    The possible results of the preceding SELECT appear in the table below:

     

    city

    kind

    sales

    Group Sum (sales)

    Omaha

    pure pork

    45

    395

    Omaha

    pure pork

    125

    395

    Omaha

    pure pork

    25

    395

    Omaha

    variety pack

    25

    395

    Chicago

    variety pack

    55

    395

    Chicago

    variety pack

    45

    395

    Chicago

    pure pork

    50

    395

    Chicago

    variety pack

    25

    395

    Example : Moving Sum

    The following query returns the moving sum of meat sales by city. Notice that the query returns the moving sum of sales by city (the partition) for the current row (of the partition) and three preceding rows where possible.

    The order in which each meat variety is returned is the default ascending order according to profit.

    Where no sales figures are available, no moving sum of sales is possible. In this case, there is a null in the sum(sales) column.

       SELECT city, kind, sales, profit,
       SUM(sales) OVER (PARTITION BY city, kind
                        ORDER BY profit ROWS 3 PRECEDING)
       FROM monthly;
     

    city

    kind

    sales

    profit

    Moving sum (sales)

    Omaha

    pure pork

    25

    40

    25

    Omaha

    pure pork

    25

    120

    50

    Omaha

    pure pork

    45

    140

    95

    Omaha

    pure pork

    125

    190

    220

    Omaha

    pure pork

    45

    320

    240

    Omaha

    pure pork

    1255

    400

    340

    Omaha

    variety pack

    ?

    ?

    ?

    Omaha

    variety pack

    25

    40

    25

    Omaha

    variety pack

    25

    120

    50

    Chicago

    pure pork

    ?

    ?

    ?

    Chicago

    pure pork

    15

    10

    15

    Chicago

    pure pork

    54

    12

    69

    Chicago

    pure pork

    14

    20

    83

    Chicago

    pure pork

    54

    24

    137

    Chicago

    pure pork

    14

    34

    136

    Chicago

    pure pork

    95

    80

    177

    Chicago

    pure pork

    95

    140

    258

    Chicago

    pure pork

    15

    220

    219

    Chicago

    variety pack

    23

    39

    23

    Chicago

    variety pack

    25

    40

    48

    Chicago

    variety pack

    125

    70

    173

    Chicago

    variety pack

    125

    100

    298

    Chicago

    variety pack

    23

    100

    298

    Chicago

    variety pack

    25

    120

    298

    Example : Remaining Sum

    The following query returns the remaining sum of meat sales for all cities. Note there is no PARTITION BY clause in the SUM function, so all cities are included in the remaining sum.

       SELECT city, kind, sales,
       SUM(sales) OVER (ORDER BY city, kind
                        ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
       FROM monthly;
     

    The possible results of the preceding SELECT appear in the table below:

    city     kind           sales    Remaining Sum(sales)
    -------  -------------  -------  --------------------
    Omaha    variety pack   25       ?
    Omaha    pure pork      125      25
    Omaha    pure pork      25       150
    Omaha    pure pork      45       175
    Chicago  variety pack   55       220
    Chicago  variety pack   25       275
    Chicago  variety pack   45       300
    Chicago  pure pork      50       345
     

    Note that the sort order for the computation is alphabetical by city, and then by kind. The results, however, appear in the reverse order.

    The sort order that you specify in the window specification defines the sort order of the rows over which the function is applied; it does not define the ordering of the results. To order the results, use an ORDER BY phrase in the SELECT statement.

    For example:

       SELECT city, kind, sales,
       SUM(sales) OVER (ORDER BY city, kind
                        ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING)
       FROM monthly
       ORDER BY city, kind;
     

    The possible results of the preceding SELECT appear in the table below:

    city     kind           sales    Remaining Sum(sales)
    -------  -------------  -------  --------------------
    Chicago  pure pork      50       345
    Chicago  variety pack   55       265
    Chicago  variety pack   25       320
    Chicago  variety pack   45       220
    Omaha    pure pork      25       70
    Omaha    pure pork      125      95
    Omaha    pure pork      45       25
    Omaha    variety pack   25       ?

    GROUP BY Clause

    GROUP BY and Window Functions

    For window functions, the GROUP BY clause must include all the columns specified in the:

  • Select list of the SELECT clause
  • Window functions in the select list of a SELECT clause
  • Window functions in the search condition of a QUALIFY clause
  • The condition in the RESET WHEN clause
  • For example, the following SELECT statement specifies the column City in the select list and the column StoreID in the COUNT window function in the select list and QUALIFY clause. Both columns must also appear in the GROUP BY clause:

       SELECT City, StoreID, COUNT(StoreID) OVER ()
       FROM sales_tbl 
       GROUP BY City, StoreID
       QUALIFY COUNT(StoreID) >=3;

    For window functions, GROUP BY collapses all rows with the same value for the group-by columns into a single row.

    For example, the following statement uses the GROUP BY clause to collapse all rows with the same value for City and StoreID into a single row:

       SELECT City, StoreID, COUNT(StoreID) OVER ()
       FROM sales_tbl 
       GROUP BY City, StoreID;

    The results look like this:

       City   StoreID  Group Count(StoreID)
       -----  -------  --------------------
       Pecos     1001                     3
       Pecos     1002                     3
       Ozona     1003                     3

    Without the GROUP BY, the results look like this:

       City   StoreID  Group Count(StoreID)
       -----  -------  --------------------
       Pecos     1001                     9
       Pecos     1001                     9
       Pecos     1001                     9
       Pecos     1001                     9
       Pecos     1002                     9
       Pecos     1002                     9
       Pecos     1002                     9
       Ozona     1003                     9
       Ozona     1003                     9

    GROUP BY and Teradata-Specific Functions

    For Teradata-specific functions, GROUP BY determines the partitions over which the function executes. The clause does not collapse all rows with the same value for the group-by columns into a single row. Thus, the GROUP BY clause in these cases need only specify the partitioning column for the function.

    For example, the following statement computes the running sales for each store by using the GROUP BY clause to partition the data in sales_tbl by StoreID:

       SELECT StoreID, Sales, CSUM(Sales, StoreID)
       FROM sales_tbl 
       GROUP BY StoreID;

    The results look like this:

       StoreID     Sales  CSum(Sales,StoreID)
       -------  --------  -------------------
          1001   1100.00              1100.00
          1001    400.00              1500.00
          1001   1000.00              2500.00
          1001   2000.00              4500.00
          1002    500.00               500.00
          1002   1500.00              2000.00
          1002   2500.00              4500.00
          1003   1000.00              1000.00
          1003   3000.00              4000.00

    Combining Window Functions, Teradata-Specific Functions, and GROUP BY

    The following table provides the semantics of the allowable combinations of window functions, Teradata-specific functions, aggregate functions, and the GROUP BY clause.

     

     

     

    Combination

     

    Semantics

    Window Function

    Teradata-Specific Function

    Aggregate Function

    GROUP BY Clause

     

    X

     

     

     

    A value is computed for each row.

     

    X

     

     

    A value is computed for each row. The entire table constitutes a single group, or partition, over which the Teradata-specific function executes.

     

     

    X

     

    One aggregate value is computed for the entire table.

    X

     

     

    X

    GROUP BY collapses all rows with the same value for the group-by columns into a single row, and a value is computed for each resulting row.

     

    X

     

    X

    GROUP BY determines the partitions over which the Teradata-specific function executes. The clause does not collapse all rows with the same value for the group-by columns into a single row.

     

     

    X

    X

    An aggregation is performed for each group.

    X

    X

     

     

    Teradata-specific functions do not have partitions. The whole table is one partition.

    X

    X

     

    X

    GROUP BY determines partitions for Teradata-specific functions. GROUP BY does not collapse all rows with the same value for the group-by columns into a single row, and does not affect window function computation.

    X

     

    X

    X

    GROUP BY collapses all rows with the same value for the group-by columns into a single row. For window functions, a value is computed for each resulting row; for aggregate functions, an aggregation is performed for each group.

    Possible Result Overflow with SELECT Sum

    When using this function, the result can create an overflow when the data type and format are not in sync. For a column defined as:

    Salary Decimal(15,2) Format ‘$ZZZ,ZZ9.99’

    The following query:

    SELECT SUM (Salary) FROM Employee;

    causes an overflow because the decimal operand and the format are not in sync.

    To avoid possible overflows, explicitly specify the format for decimal sum to specify a format large enough to accommodate the decimal sum resultant data type.

    SELECT Sum(Salary) (format ‘$Z,ZZZ,ZZZ,ZZ9.99) FROM Employee;