The Window Feature - Teradata Database

SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Language
English (United States)
Last Update
2018-09-24
dita:id
B035-1145
lifecycle
previous
Product Category
Teradata® Database

The Window Feature

The ANSI SQL:2011 window feature provides a way to dynamically define a subset of data, or window, in an ordered relational database table. A window is specified by the OVER() phrase, which can include the following clauses inside the parentheses:

  • PARTITION BY
  • ORDER BY
  • RESET WHEN
  • ROWS
  • To see the syntax for the OVER() phrase and the associated clauses, see “Window Aggregate Functions” on page 984.

    The window specification can also be applied to a user-defined aggregate function. For details, see “Window Aggregate UDF” on page 1330.

    PARTITION BY Phrase

    PARTITION BY takes a column reference list and groups the rows based on the specified column reference list over which the ordered analytical function executes. Such a grouping is static. To define a group or partition based on a condition, use the RESET WHEN phrase. See “RESET WHEN Phrase” on page 979 for details.

    If there is no PARTITION BY phrase or RESET WHEN phrase, then the entire result set, delivered by the FROM clause, constitutes a single partition, over which the ordered analytical function executes.

    Consider the following table named sales_tbl.

     

    StoreID

    SMonth

    ProdID

    Sales

    1001

    1

    C

    35000.00

    1001

    2

    C

    25000.00

    1001

    3

    C

    40000.00

    1001

    4

    C

    25000.00

    1001

    5

    C

    30000.00

    1001

    6

    C

    30000.00

    1002

    1

    C

    40000.00

    1002

    2

    C

    35000.00

    1002

    3

    C

    110000.00

    1002

    4

    C

    60000.00

    1002

    5

    C

    35000.00

    1002

    6

    C

    100000.00

    The following SELECT statement, which does not include PARTITION BY, computes the average sales for all the stores in the table:

       SELECT StoreID, SMonth, ProdID, Sales, 
       AVG(Sales) OVER () 
       FROM sales_tbl;
       
       StoreID  SMonth  ProdID      Sales  Group Avg(Sales)
       -------  ------  ------  ---------  ----------------
          1001       1  C        35000.00          47083.33
          1001       2  C        25000.00          47083.33
          1001       3  C        40000.00          47083.33
          1001       4  C        25000.00          47083.33
          1001       5  C        30000.00          47083.33
          1001       6  C        30000.00          47083.33
          1002       1  C        40000.00          47083.33
          1002       2  C        35000.00          47083.33
          1002       3  C       110000.00          47083.33
          1002       4  C        60000.00          47083.33
          1002       5  C        35000.00          47083.33
          1002       6  C       100000.00          47083.33
       

    To compute the average sales for each store, partition the data in sales_tbl by StoreID:

       SELECT StoreID, SMonth, ProdID, Sales, 
       AVG(Sales) OVER (PARTITION BY StoreID)
       FROM sales_tbl;
       
       StoreID  SMonth  ProdID      Sales  Group Avg(Sales)
       -------  ------  ------  ---------  ----------------
          1001       3  C        40000.00          30833.33
          1001       5  C        30000.00          30833.33
          1001       6  C        30000.00          30833.33
          1001       4  C        25000.00          30833.33
          1001       2  C        25000.00          30833.33
          1001       1  C        35000.00          30833.33
          1002       3  C       110000.00          63333.33
          1002       5  C        35000.00          63333.33
          1002       6  C       100000.00          63333.33
          1002       4  C        60000.00          63333.33
          1002       2  C        35000.00          63333.33
          1002       1  C        40000.00          63333.33
       

    ORDER BY Phrase

    ORDER BY specifies how the rows are ordered in a partition, which determines the sort order of the rows over which the function is applied.

    To add the monthly sales for a store in the sales_tbl table to the sales for previous months, compute the cumulative sales sum and order the rows in each partition by SMonth:

       SELECT StoreID, SMonth, ProdID, Sales,
       SUM(Sales) OVER (PARTITION BY StoreID ORDER BY SMonth 
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       FROM sales_tbl;
       
       StoreID  SMonth  ProdID      Sales  Cumulative Sum(Sales)
       -------  ------  ------  ---------  ---------------------
          1001       1  C        35000.00               35000.00
          1001       2  C        25000.00               60000.00
          1001       3  C        40000.00              100000.00
          1001       4  C        25000.00              125000.00
          1001       5  C        30000.00              155000.00
          1001       6  C        30000.00              185000.00
          1002       1  C        40000.00               40000.00
          1002       2  C        35000.00               75000.00
          1002       3  C       110000.00              185000.00
          1002       4  C        60000.00              245000.00
          1002       5  C        35000.00              280000.00
          1002       6  C       100000.00              380000.00

    RESET WHEN Phrase

    RESET WHEN is a Teradata extension to the ANSI SQL standard.

    Depending on the evaluation of the specified condition, RESET WHEN determines the group or partition, over which the ordered analytical function operates. If the condition evaluates to TRUE, a new dynamic partition is created inside the specified window partition. To define a partition based on a column reference list, use the PARTITION BY phrase. See “PARTITION BY Phrase” on page 977 for details.

    If there is no RESET WHEN phrase or PARTITION BY phrase, then the entire result set, delivered by the FROM clause, constitutes a single partition, over which the ordered analytical function executes.

    You can have different RESET WHEN clauses in the same SELECT list.

    Note: A window specification that specifies a RESET WHEN clause must also specify an ORDER BY clause.

    RESET WHEN Condition Rules

    The condition in the RESET WHEN clause is equivalent in scope to the condition in a QUALIFY clause with the additional constraint that nested ordered analytical functions cannot specify conditional partitioning.

    The condition is applied to the rows in all designated window partitions to create sub-partitions within the particular window partitions.

    The following rules apply for RESET WHEN conditions.

    A RESET WHEN condition can contain the following:

  • Ordered analytical functions that do not include the RESET WHEN clause
  • Scalar subqueries
  • Aggregate operators
  • DEFAULT functions
  • However, DEFAULT without an explicit column specification is valid only if it is specified as a standalone condition in the predicate. See “Rules For Using a DEFAULT Function As Part of a RESET WHEN Condition” on page 980 for details.

    A RESET WHEN condition cannot contain the following:

  • Ordered analytical functions that include the RESET WHEN clause
  • The SELECT statement
  • LOB columns
  • UDT expressions, including UDFs that return a UDT value
  • However, a RESET WHEN condition can include an expression that contains UDTs as long as that expression returns a result that has a predefined data type.

    Rules For Using a DEFAULT Function As Part of a RESET WHEN Condition

    The following rules apply to the use of the DEFAULT function as part of a RESET WHEN condition:

  • You can specify a DEFAULT function with a column name argument within a predicate. The system evaluates the DEFAULT function to the default value of the column specified as its argument. After the system evaluates the DEFAULT function, it treats it like a literal in the predicate.
  • You can specify a DEFAULT function without a column name argument within a predicate only if there is one column specification and one DEFAULT function as the terms on each side of the comparison operator within the expression.
  • Following existing comparison rules, a condition with a DEFAULT function used with comparison operators other than IS [NOT] NULL is unknown if the DEFAULT function evaluates to null.
  • A condition other than IS [NOT]NULL with a DEFAULT function compared with a null evaluates to unknown.

     

    IF a DEFAULT function is used with...

    THEN the comparison is...

    IS NULL

    TRUE if the default is null,

    else it is FALSE.

    IS NOT NULL

    FALSE if the default is null,

    else it is TRUE.

    See “DEFAULT” on page 291 for more information about the DEFAULT function.

    Example  

    This example finds cumulative sales for all periods of increasing sales for each region.

       SUM(sales) OVER (
            PARTITION BY region
            ORDER BY day_of_calendar
            RESET WHEN sales < /* preceding row */ SUM(sales) OVER (
                 PARTITION BY region
                 ORDER BY day_of_calendar
                 ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
            ROWS UNBOUNDED PRECEDING
       )

    Example  

    This example finds sequences of increasing balances. This implies that we reset whenever the current balance is less than or equal to the preceding balance.

       SELECT account_key, month, balance,
       ROW_NUMBER() over
            (PARTITION BY account_key
             ORDER BY month
             RESET WHEN balance /* current row balance */ <=
             SUM(balance) over (PARTITION BY account_key ORDER BY month
             ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) /* prev row */
             ) - 1 /* to get the count started at 0 */ as balance_increase
       FROM accounts;

    The possible results of the preceding SELECT appear in the table below:

    account_key        month      balance  balance_increase
    -----------        -----      -------  ----------------
              1            1           60                 0
              1            2           99                 1
              1            3           94                 0
              1            4           90                 0
              1            5           80                 0
              1            6           88                 1
              1            7           90                 2
              1            8           92                 3
              1            9           10                 0
              1           10           60                 1
              1           11           80                 2
              1           12           10                 0

    Example  

    The following example illustrates a window function with a nested aggregate. The query is processed as follows:

    1 We use the SUM(balance) aggregate function to calculate the sum of all the balances for a given account in a given quarter.

    2 We check to see if a balance in a given quarter (for a given account) is greater than the balance of the previous quarter.

    3 If the balance increased, we track a cumulative count value. As long as the RESET WHEN condition evaluates to false, the balance is increasing over successive quarters, and we continue to increase the count.

    4 We use the ROW_NUMBER() ordered analytical function to calculate the count value. When we reach a quarter whose balance is less than or equal to that of the previous quarter, the RESET WHEN condition evaluates to true, and we start a new partition and ROW_NUMBER() restarts the count from 1. We specify ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING to access the previous value.

    5 Finally, we subtract 1 to ensure that the count values start with 0.

    The balance_increase column shows the number of successive quarters where the balance was increasing. In this example, we only have one quarter (1->2) where the balance has increased.

       SELECT account_key, quarter, sum(balance),
       ROW_NUMBER() over
            (PARTITION BY account_key
             ORDER BY quarter
             RESET WHEN sum(balance) /* current row balance */ <=
             SUM(sum(balance)) over (PARTITION BY account_key ORDER BY quarter
             ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)/* prev row */
             ) - 1 /* to get the count started at 0 */ as balance_increase
       FROM accounts
       GROUP BY account_key, quarter;

    The possible results of the preceding SELECT appear in the table below:

    account_key    quarter    balance    balance_increase
    -----------    -------    -------    ----------------
              1          1        253                   0
              1          2        258                   1
              1          3        192                   0
              1          4        150                   0

    Example  

    In the following example, the condition in the RESET WHEN clause contains SELECT as a nested subquery. This is not allowed and results in an error.

       SELECT SUM(a1) OVER
            (ORDER BY 1
             RESET WHEN 1 in (SELECT 1))
       FROM t1;
       $
       *** Failure 3706 Syntax error: SELECT clause not supported in
       RESET...WHEN clause.

    ROWS Phrase

    ROWS defines the rows over which the aggregate function is computed for each row in the partition.

    If ROWS is specified, the computation of the aggregate function for each row in the partition includes only the subset of rows in the ROWS phrase.

    If there is no ROWS phrase, then the computation includes all the rows in the partition.

    To compute the three-month moving average sales for each store in the sales_tbl table, partition by StoreID, order by SMonth, and perform the computation over the current row and the two preceding rows:

       SELECT StoreID, SMonth, ProdID, Sales,
       AVG(Sales) OVER (PARTITION BY StoreID 
                        ORDER BY SMonth
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
       FROM sales_tbl;
       
       StoreID  SMonth  ProdID      Sales  Moving Avg(Sales)
       -------  ------  ------  ---------  -----------------
          1001       1  C        35000.00           35000.00
          1001       2  C        25000.00           30000.00
          1001       3  C        40000.00           33333.33
          1001       4  C        25000.00           30000.00
          1001       5  C        30000.00           31666.67
          1001       6  C        30000.00           28333.33
          1002       1  C        40000.00           40000.00
          1002       2  C        35000.00           37500.00
          1002       3  C       110000.00           61666.67
          1002       4  C        60000.00           68333.33
          1002       5  C        35000.00           68333.33
          1002       6  C       100000.00           65000.00
       

    Multiple Window Specifications

    In an SQL statement using more than one window function, each window function can have a unique window specification.

    For example,

       SELECT StoreID, SMonth, ProdID, Sales,
       AVG(Sales) OVER (PARTITION BY StoreID 
                        ORDER BY SMonth
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),
       RANK() OVER (PARTITION BY StoreID ORDER BY Sales DESC)
       FROM sales_tbl;