15.00 - Aggregate Functions - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)
Last Update
2018-09-24

Aggregate Functions

Aggregate functions are typically used in arithmetic expressions. Aggregate functions operate on a group of rows and return a single numeric value in the result table for each group.

In the following statement, the SUM aggregate function operates on the group of rows defined by the Sales_Table table:

   SELECT SUM(Total_Sales)
   FROM Sales_Table;
 
   Sum(Total_Sales)
   ----------------
            5192.40

You can use GROUP BY clauses to produce more complex, finer grained results in multiple result values. In the following statement, the SUM aggregate function operates on groups of rows defined by the Product_ID column in the Sales_Table table:

   SELECT Product_ID, SUM(Total_Sales)
   FROM Sales_Table
   GROUP BY Product_ID;
 
   Product_ID  Sum(Total_Sales)
   ----------  ----------------
          101           2100.00
          107           1000.40
          102           2092.00

Aggregates in the Select List

Aggregate functions are normally used in the expression list of a SELECT statement and in the summary list of a WITH clause.

Aggregates and GROUP BY

If you use an aggregate function in the select list of an SQL statement, then either all other columns occurring in the select list must also be referenced by means of aggregate functions or their column name must appear in a GROUP BY clause. For example, the following statement uses an aggregate function and a column in the select list and references the column name in the GROUP BY clause:

   SELECT COUNT(*), Product_ID
   FROM Sales_Table
   GROUP BY Product_ID;

The reason for this is that aggregates return only one value, while a non-GROUP BY column reference can return any number of values.

Aggregates and Date

It is valid to apply AVG, MIN, MAX, or COUNT to a date. It is not valid to specify SUM(date).

Aggregates and Literal Expressions in the Select List

Literal expressions in the select list may optionally appear in the GROUP BY clause. For example, the following statement uses an aggregate function and a literal expression in the select list, and does not use a GROUP BY clause:

   SELECT COUNT(*), 
   SUBSTRING( CAST( CURRENT_TIME(0) AS CHAR(14) ) FROM 1 FOR 8 )
   FROM Sales_Table;

The results of such statements when the table has no rows depends on the type of literal expression.

 

IF the literal expression …

THEN the result of the literal expression in the query result is …

does not contain a column reference

the value of the literal expression.

Functions such as RANDOM are computed in the immediate retrieve step of the request instead of in the aggregation step.

Here is an example:

SELECT COUNT(*), 
SUBSTRING(CAST(CURRENT_TIME(0) AS CHAR(14))
FROM 1 FOR 8)
FROM Sales_Table;
Count(*) Substring(Current Time(0) From 1 For 8)
-------- ---------------------------------------
       0 09:01:43

is a non-deterministic function, such as RANDOM

contains a column reference

NULL.

Here is an example:

SELECT COUNT(*), UDF_CALC(1,2) 
FROM Sales_Table;
   Count(*) UDF_CALC(1,2)
----------- -------------
          0             ?

is a UDF

Nesting Aggregates

Aggregate operations cannot be nested. The following aggregate is not valid and returns an error:

   AVG(MAXIMUM (Salary)) 

But aggregates can be nested in aggregate window functions. The following statement is valid and includes an aggregate SUM function nested in a RANK window function:

   SELECT region
      ,product
      ,SUM(amount)
      ,RANK() OVER (PARTITION BY region ORDER by SUM (amount))
   FROM table;

For details on aggregate window functions, see Chapter 22: “Ordered Analytical / Window Aggregate Functions.”

Results of Aggregation on Zero Rows

Aggregation on zero rows behaves as indicated by the following table.

 

This form of aggregate function …

Returns this result when there are zero rows …

COUNT(expression) WHERE …

0

all other forms of aggregate_operator(expression) WHERE …

Null

aggregate_operator(expression) … GROUP BY …

No Record Found

aggregate_operator(expression) … HAVING …

Aggregates and Nulls

Aggregates (with the exception of COUNT(*)) ignore nulls in all computations.

Note: A UDT column value is null only when you explicitly place a NULL in a column, not when a UDT instance has an attribute that is set to null.

Ignoring nulls can result in apparent nontransitive anomalies. For example, if there are nulls in either column A or column B (or both), then the following expression is virtually always true.

   SUM(A) + SUM(B) <> SUM(A+B) 

The only exception to this is the case in which the values for columns A and B are both null in the same rows, because in those cases the entire row is disregarded in the aggregation. This is a trivial case that does not violate the general rule.

More formally stated, if and only if field A and field B are both null for every occurrence of a null in either field is the above inequality false.

For examples that illustrate this behavior, see “Example 2” on page 42 and “Example 3” on page 42. Note that the aggregates are behaving exactly as they should, the results are not mathematically anomalous.

There are several ways to work around this apparent nontransitivity issue if it presents a problem. Either solution provides the same consistent results.

  • Always define your numeric columns as NOT NULL DEFAULT 0
  • Use the ZEROIFNULL function within the aggregate function to convert any nulls to zeros for the computation, for example SUM(ZEROIFNULL(x) + ZEROIFNULL(y)), which produces the same result as SUM(ZEROIFNULL(x) + ZEROIFNULL(y)).
  • Aggregate Operations on Floating Point Data

    Operations involving floating point numbers are not always associative due to approximation and rounding errors: ((A + B) + C) is not always equal to (A + (B + C)).

    Although not readily apparent, the non-associativity of floating point arithmetic can also affect aggregate operations: you can get different results each time you use an aggregate function on a given set of floating point data. When Teradata Database performs an aggregation, it accumulates individual terms from each AMP involved in the computation and evaluates the terms in order of arrival to produce the final result. Because the order of evaluation can produce slightly different results, and because the order in which individual AMPs finish their part of the work is unpredictable, the results of an aggregate function on the same data on the same system can vary.

    For more information on potential problems associated with floating point values in computations, see SQL Data Types and Literals.

    Aggregates and LOBs

    Aggregates do not operate on CLOB or BLOB data types.

    Aggregates and Period Data Types

    Aggregates (with the exception of COUNT) do not operate on Period data types.

    Aggregates and SELECT AND CONSUME Statements

    Aggregates cannot appear in SELECT AND CONSUME statements.

    Aggregates and Recursive Queries

    Aggregate functions cannot appear in a recursive statement of a recursive query. However, a non-recursive seed statement in a recursive query can specify an aggregate function.

    Aggregates in WHERE and HAVING Clauses

    Aggregates can appear in the following types of clauses:

  • The WHERE clause of an ABORT statement to specify an abort condition.
  • But an aggregate function cannot appear in the WHERE clause of a SELECT statement.

  • A HAVING clause to specify a group condition.
  • DISTINCT Option

    The DISTINCT option specifies that duplicate values are not to be used when an expression is processed.

    The following SELECT returns the number of unique job titles in a table.

       SELECT COUNT(DISTINCT JobTitle) FROM Employee;

    A query can have multiple aggregate functions that use DISTINCT with the same expression, as shown by the following example.

       SELECT SUM(DISTINCT x), AVG(DISTINCT x) FROM XTable;

    A query can also have multiple aggregate functions that use DISTINCT with different expressions, for example:

       SELECT SUM(DISTINCT x), SUM(DISTINCT y) FROM XYTable;

    Aggregates and Row Level Security Tables

    When a request that includes an aggregate function, such as SUM, COUNT, MAX, MIN or AVG, references a table protected by row level security, the aggregation is based on only the rows that are accessible to the requesting user. In order to apply all rows of the table to the aggregation, the user must have one of the following:

  • The required security credentials to access all rows of the table.
  • The required OVERRIDE privileges on the security constraints in the table.
  • For more information about row level security, see Security Administration.