15.00 - Window Aggregate UDF - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

Window Aggregate UDF

Purpose  

Allows an aggregate UDF with a window specification applied to it to operate on a specified window of rows.

Syntax  

where:

 

Syntax element …

Specifies …

udf_name

the name of the aggregate UDF on which the window specification is applied.

argument

a valid SQL expression. For rules that apply to aggregate UDF arguments, see “Aggregate UDF” on page 1327.

OVER

how values are grouped, ordered, and considered when computing the cumulative, group, or moving function.

Values are grouped according to the PARTITION BY and RESET WHEN clauses, sorted according to the ORDER BY clause, and considered according to the aggregation group within the partition.

PARTITION BY

in its column_reference, or comma-separated list of column references, the group, or groups, over which the function operates.

PARTITION BY is optional. If there are no PARTITION BY or RESET WHEN clauses, then the entire result set, delivered by the FROM clause, constitutes a single group, or partition.

PARTITION BY clause is also called the window partition clause.

ORDER BY

in its value_expression the order in which the values in a group, or partition, are sorted.

ASC

ascending sort order.

The default is ASC.

DESC

descending sort order.

RESET WHEN

the group or partition, over which the function operates, depending on the evaluation of the specified condition. If the condition evaluates to TRUE, a new dynamic partition is created inside the specified window partition.

RESET WHEN is optional. If there are no RESET WHEN or PARTITION BY clauses, then the entire result set, delivered by the FROM clause, constitutes a single partition.

If RESET WHEN is specified, then the ORDER BY clause must be specified also.

condition

a conditional expression used to determine conditional partitioning. The condition in the RESET WHEN clause is equivalent in scope to the condition in a QUALIFY clause with the additional constraint that nested ordered analytical functions cannot specify a RESET WHEN clause. In addition, you cannot specify SELECT as a nested subquery within the condition.

The condition is applied to the rows in all designated window partitions to create sub-partitions within the particular window partitions.

For more information, see “RESET WHEN Condition Rules” on page 980 and the “QUALIFY Clause” in SQL Data Manipulation Language.

ROWS

the starting point for the aggregation group within the partition. The aggregation group end is the current row.

The aggregation group of a row R is a set of rows, defined relative to R in the ordering of the rows within the partition.

If there is no ROWS or ROWS BETWEEN clause, the default aggregation group is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

ROWS BETWEEN

the aggregation group start and end, which defines a set of rows relative to the current row in the ordering of the rows within the partition.

The row specified by the group start must precede the row specified by the group end.

If there is no ROWS or ROWS BETWEEN clause, the default aggregation group is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

UNBOUNDED PRECEDING

the entire partition preceding the current row.

UNBOUNDED FOLLOWING

the entire partition following the current row.

CURRENT ROW

the start or end of the aggregation group as the current row.

value PRECEDING

the number of rows preceding the current row.

The value for value is always a positive integer literal.

The maximum number of rows in an aggregation group is 4096 when value PRECEDING appears as the group start or group end.

value FOLLOWING

the number of rows following the current row.

The value for value is always a positive integer literal.

The maximum number of rows in an aggregation group is 4096 when value FOLLOWING appears as the group start or group end.

ANSI Compliance

Window aggregate UDFs are partially ANSI SQL:2011 compliant.

The requirement that parentheses appear when the argument list of an aggregate UDF is empty is a Teradata extension to preserve compatibility with existing applications.

In the presence of an ORDER BY clause and the absence of a ROWS or ROWS BETWEEN clause, ANSI SQL:2011 window aggregate functions use ROWS UNBOUNDED PRECEDING as the default aggregation group, whereas Teradata SQL window aggregate functions use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

The RESET WHEN clause is a Teradata extension to the ANSI SQL standard.

Authorization

You must have EXECUTE FUNCTION privileges on the function or on the database containing the function.

To invoke an aggregate UDF that takes a UDT argument or returns a UDT, you must have the UDTUSAGE privilege on the SYSUDTLIB database or on the specified UDT.

Arguments to Window Aggregate UDFs

Window aggregate UDFs can take literals, literal expressions, column names (sales, for example), or column expressions (sales + profit) as arguments.

Window aggregates can also take regular aggregates as input parameters to the PARTITION BY and ORDER BY clauses. The RESET WHEN clause can take an aggregate as part of the RESET WHEN condition clause.

The rules that apply to the arguments of the window aggregate UDF are the same as those that apply to aggregate UDF arguments, see “Aggregate UDF” on page 1327.

Supported Window Types for Aggregate UDFs

 

Window Type

Aggregation Group

Supported Partitioning Strategy

Reporting window

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

Hash partitioning

Cumulative window

  • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  • ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  • Hash partitioning

    Moving window

  • ROWS BETWEEN value PRECEDING AND CURRENT ROW
  • ROWS BETWEEN CURRENT ROW AND value FOLLOWING
  • ROWS BETWEEN value PRECEDING AND value FOLLOWING
  • ROWS BETWEEN value PRECEDING AND value PRECEDING
  • ROWS BETWEEN value FOLLOWING AND value FOLLOWING
  • Hash partitioning and range partitioning

    Consider the following table definition:

       CREATE TABLE t (id INTEGER, v INTEGER);

    The following query specifies a reporting window of rows which the window aggregate UDF MYSUM operates on:

       SELECT id, v, MYSUM(v) OVER (PARTITION BY id ORDER BY v) 
       FROM t;

    The following query specifies a cumulative window of rows which the window aggregate UDF MYSUM operates on:

       SELECT id, v, MYSUM(v) OVER (PARTITION BY id ORDER BY v 
                                    ROWS UNBOUNDED PRECEDING) 
       FROM t;

    The following query specifies a moving window of rows which the window aggregate UDF MYSUM operates on:

      SELECT id, v, MYSUM(v) OVER (PARTITION BY id ORDER BY v 
                                   ROWS BETWEEN 2 PRECEDING AND 3 FOLLOWING)
      FROM t;

    Unsupported Window Types for Aggregate UDFs

     

    Window Type

    Aggregation Group

    Moving window

  • ROWS BETWEEN UNBOUNDED PRECEDING AND value FOLLOWING
  • ROWS BETWEEN value PRECEDING AND UNBOUNDED FOLLOWING
  • Partitioning

    The range partitioning strategy helps to avoid hot AMP situations where the values of the columns of the PARTITION BY clause result in the distribution of too many rows to the same partition or AMP.

    Range and hash partitioning is supported for moving window types. Only hash partitioning is supported for the reporting and cumulative window types because of potential ambiguities that can occur when a user tries to reference previous values assuming a specific ordering within window types like reporting and cumulative, which are semantically not order dependant.

    You should use an appropriate set of column values for the PARTITION BY clause to avoid potential skew situations for the reporting or cumulative aggregate cases. For more information, see “Data in Partitioning Column of Window Specification and Resource Impact” on page 974.

    Result Type and Format

    The result data type of a window aggregate UDF is based on the return type of the aggregate UDF, which is specified in the RETURNS clause of the CREATE FUNCTION statement.

    The default format of a window aggregate UDF is the default format for the return type. For information on the default format of data types and an explanation of the formatting characters in the format, see “Data Type Formats and Format Phrases” in SQL Data Types and Literals.

    Usage Notes  

    You can apply a window specification to an aggregate UDF. The window feature provides a way to dynamically define a subset of data, or window, and allows the aggregate function to operate on that window of rows. Without a window specification, aggregate functions return one value for all qualified rows examined, but window aggregate functions return a new value for each of the qualifying rows participating in the query.

    Problems With Missing Data

    Ensure that data you analyze has no missing data points. Computing a moving function over data with missing points produces unexpected and incorrect results because the computation considers n physical rows of data rather than n logical data points.

    Restrictions

  • The window feature is not supported for aggregate UDFs written in Java.
  • Range partitioning for the reporting or cumulative window types is not supported.
  • Any restrictions that apply to aggregate UDFs also apply to window aggregate UDFs.
  • Any restrictions that apply to the window specification of a standard SQL aggregate function also apply to the window specification of an aggregate UDF.
  • Example  

    Consider the following table definition and inserted data:

       CREATE MULTISET TABLE t
          (id INTEGER,
           v  INTEGER);
     
       INSERT INTO t VALUES (1,1);
       INSERT INTO t VALUES (1,2);
       INSERT INTO t VALUES (1,2);
       INSERT INTO t VALUES (1,4);
       INSERT INTO t VALUES (1,5);
       INSERT INTO t VALUES (1,5);
       INSERT INTO t VALUES (1,5);
       INSERT INTO t VALUES (1,8);
       INSERT INTO t VALUES (1,);

    The following is the SQL definition of a window aggregate UDF that performs the dense rank operation:

       REPLACE FUNCTION dense_rank (x INTEGER)
       RETURNS INTEGER
       CLASS AGGREGATE (1000)
       LANGUAGE C
       NO SQL
       PARAMETER STYLE SQL
       DETERMINISTIC
       CALLED ON NULL INPUT
       EXTERNAL;

    The dense_rank UDF evaluates dense rank over the set of values passed as arguments to the UDF. With dense ranking, items that compare equal receive the same ranking number, and the next item(s) receive the immediately following ranking number. In the following query and result, note the difference in the rank and dense rank value for v=4. The dense rank value is 4 whereas the rank of 4 is 5.

       SELECT v, dense_rank(v) OVER (PARTITION BY id ORDER BY v 
          ROWS UNBOUNDED PRECEDING) as dr,
          rank() OVER (PARTITION BY id ORDER BY v) as r
       FROM t ORDER BY dr;

    The output from the SELECT statement is:

              v           dr            r
    -----------  -----------  -----------
              ?            1            1
              1            2            2
              2            3            3
              2            3            3
              4            4            5
              5            5            6
              5            5            6
              5            5            6
              8            6            9

    For a C code example of the dense_rank UDF, see “C Window Aggregate Function” in SQL External Routine Programming.

    Related Topics

     

    FOR more information on …

    SEE …

    aggregate UDFs

    “Aggregate UDF” on page 1327.

    ordered analytical functions and the window feature

    “The Window Feature” on page 977.

    implementing window aggregate UDFs

    SQL External Routine Programming.

  • CREATE FUNCTION
  • REPLACE FUNCTION
  • SQL Data Definition Language.
  • Database Administration.
  • EXECUTE FUNCTION and UDTUSAGE privileges

    SQL Data Control Language.