Aggregate Functions in Java - Aster Execution Engine

Teradata Aster® Developer Guide

Product
Aster Execution Engine
Release Number
7.00.02
Published
July 2017
Language
English (United States)
Last Update
2018-04-13
Product Category
Software

This section guides you through writing and invoking a simple user-defined aggregate function in Java. It assumes you have downloaded the Aster Execution Engine SQL-Analytics SDK.

To write a user-defined aggregate function in Java, create a Java class that implements one of the following interfaces:
  • com.asterdata.ncluster.aggregator.DecomposableAggregatorFunction
  • com.asterdata.ncluster.aggregator.NonDecomposableAggregatorFunction

The decomposable aggregate class must implement a public constructor that takes:

com.asterdata.ncluster.aggregator.DecomposableAggregatorRuntimeContract

The non-decomposable aggregate class must implement a public constructor that takes:

com.asterdata.ncluster.aggregator.NonDecomposableAggregatorRuntimeContract

The name of your user-defined Java function is the name of the Java class, ignoring case differences. For example, a function named count might be implemented by a Java class com.mycompany.Count.
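
As a rough illustration of the naming rule, the sketch below compares a SQL function name with a class's simple name, ignoring case. The class FunctionNameSketch, its nested Count class, and the helper matches are hypothetical names introduced here; the way the engine actually resolves function names is not shown.

```java
// Hypothetical illustration only; this is not SDK code.
class FunctionNameSketch {
    // Stand-in for a user class such as com.mycompany.Count.
    static class Count {}

    // True when the SQL function name equals the class's simple name, ignoring case.
    static boolean matches(String sqlFunctionName, Class<?> implementation) {
        return implementation.getSimpleName().equalsIgnoreCase(sqlFunctionName);
    }

    public static void main(String[] args) {
        System.out.println(matches("count", Count.class));
        System.out.println(matches("COUNT", Count.class));
        System.out.println(matches("sum", Count.class));
    }
}
```

Under this assumption, count, COUNT, and Count would all refer to the same class, while sum would not.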

Teradata Aster’s SQL-Analytics framework supports two types of aggregate functions. A user-defined aggregate function must implement one of these two interfaces:

  • DecomposableAggregatorFunction

    A DecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the columns named in the optional GROUP BY clause. If there is no GROUP BY clause, the entire table is treated as a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A DecomposableAggregatorFunction consists of six functions:

    • the constructor
    • two operator functions: aggregateRow and aggregatePartialRow
    • two getter functions: getPartialRow and getFinalValue
    • reset function: reset
  • NonDecomposableAggregatorFunction

    A NonDecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the columns named in the optional GROUP BY clause. If there is no GROUP BY clause, the entire table is treated as a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A NonDecomposableAggregatorFunction consists of four functions:

    • the constructor
    • operator function: aggregateRow
    • getter function: getFinalValue
    • reset function: reset
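
The six-function shape of a decomposable aggregate can be sketched with an average, whose partial state is a running sum and count. This is a minimal sketch under stated assumptions: the class AverageSketch and its Partial type are hypothetical stand-ins, the parameter and return types are invented for illustration, and the comments describe the behavior suggested by each method name rather than the SDK's documented contract. A real implementation would implement the SDK interface and take the runtime contract in its constructor.

```java
// Hypothetical stand-ins: the real SDK interfaces, row types, and runtime
// contract have their own signatures, which are not reproduced here.
class AverageSketch {
    // Partial state that could be passed between workers: a sum and a count.
    static final class Partial {
        final double sum;
        final long count;
        Partial(double sum, long count) { this.sum = sum; this.count = count; }
    }

    private double sum;
    private long count;

    // 1. Constructor (the real one would take a runtime contract argument).
    AverageSketch() { reset(); }

    // 2. Operator: fold one input row into the running state.
    void aggregateRow(double value) { sum += value; count++; }

    // 3. Operator: fold a partial result produced elsewhere into the running state.
    void aggregatePartialRow(Partial p) { sum += p.sum; count += p.count; }

    // 4. Getter: expose the current state as a partial result.
    Partial getPartialRow() { return new Partial(sum, count); }

    // 5. Getter: produce the single output value (0.0 when no rows were seen).
    double getFinalValue() { return count == 0 ? 0.0 : sum / count; }

    // 6. Reset: clear the state so the instance can be reused.
    void reset() { sum = 0.0; count = 0; }
}
```

A non-decomposable aggregate in the same style would keep only the constructor, aggregateRow, getFinalValue, and reset, which suits computations such as a median that cannot be rebuilt from small partial states.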

From an external user's perspective, decomposable and non-decomposable aggregate functions produce the same output for the same input. The difference lies in internal execution: a decomposable aggregate function allows more of the work to run in parallel and can therefore perform better.
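
To make the equivalence concrete, the following sketch aggregates the same rows twice: once in a single pass, and once by aggregating two slices independently and merging their partial results. ParallelSumSketch, singlePass, and twoPhase are hypothetical names, the signatures are assumed, and the loop over slices only simulates work that an engine could distribute across workers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; method names follow the text, signatures are assumed.
class ParallelSumSketch {
    private long total;

    void aggregateRow(long value) { total += value; }
    void aggregatePartialRow(long partial) { total += partial; }
    long getPartialRow() { return total; }
    long getFinalValue() { return total; }
    void reset() { total = 0; }

    // Single pass: one instance sees every row of the partition.
    static long singlePass(List<Long> rows) {
        ParallelSumSketch agg = new ParallelSumSketch();
        for (long v : rows) agg.aggregateRow(v);
        return agg.getFinalValue();
    }

    // Two phases: each slice is aggregated on its own, then partials are merged.
    static long twoPhase(List<List<Long>> slices) {
        ParallelSumSketch merger = new ParallelSumSketch();
        for (List<Long> slice : slices) {
            ParallelSumSketch worker = new ParallelSumSketch();
            for (long v : slice) worker.aggregateRow(v);
            merger.aggregatePartialRow(worker.getPartialRow());
        }
        return merger.getFinalValue();
    }

    public static void main(String[] args) {
        List<Long> rows = Arrays.asList(3L, 5L, 8L, 13L);
        List<List<Long>> slices = Arrays.asList(rows.subList(0, 2), rows.subList(2, 4));
        System.out.println(singlePass(rows) == twoPhase(slices)); // prints "true"
    }
}
```

Both paths return the same total; only the second gives an engine independent units of work to schedule in parallel.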