Aggregate Functions in Java - Aster Execution Engine

Teradata Aster® Developer Guide

Product

Aster Execution Engine

Release Number

7.00.02

Published

July 2017

Language

English (United States)

Last Update

2018-04-13

Product Category

Software

This section guides you through writing and invoking a simple user-defined aggregate function in Java. It assumes you have downloaded the Aster Execution Engine SQL-Analytics SDK.

To write a user-defined aggregate function in Java, you create a Java class that implements one of the following interfaces:

com.asterdata.ncluster.aggregator.DecomposableAggregatorFunction
com.asterdata.ncluster.aggregator.NonDecomposableAggregatorFunction

The decomposable aggregate class must implement a public constructor that takes:

com.asterdata.ncluster.aggregator.DecomposableAggregatorRuntimeContract

The non-decomposable aggregate class must implement a public constructor that takes:

com.asterdata.ncluster.aggregator.NonDecomposableAggregatorRuntimeContract

The name of your user-defined Java function is the name of the Java class, ignoring case differences. For example, a function named count might be implemented by a Java class com.mycompany.Count.
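
The naming rule above can be illustrated with a small sketch. It only demonstrates the case-insensitive comparison between a function name and a class's simple name; how the engine actually resolves functions is not described here, and FunctionNameSketch and matches are hypothetical names invented for this illustration.

```java
// Illustration only: FunctionNameSketch is a hypothetical helper, not part of the SDK.
final class FunctionNameSketch {

    // Returns true when the SQL function name equals the class's simple name,
    // ignoring case differences (for example, "count" and "com.mycompany.Count").
    static boolean matches(String sqlFunctionName, String qualifiedClassName) {
        int lastDot = qualifiedClassName.lastIndexOf('.');
        String simpleName = qualifiedClassName.substring(lastDot + 1);
        return simpleName.equalsIgnoreCase(sqlFunctionName);
    }

    public static void main(String[] args) {
        System.out.println(matches("count", "com.mycompany.Count"));
        System.out.println(matches("COUNT", "com.mycompany.Count"));
        System.out.println(matches("sum", "com.mycompany.Count"));
    }
}
```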

Teradata Aster’s SQL-Analytics framework supports two types of aggregate functions. A user-defined aggregate function must implement one of these two interfaces:

DecomposableAggregatorFunction
A DecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the optional GROUP BY clause; if there is no GROUP BY clause, the entire table forms a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A DecomposableAggregatorFunction consists of six functions:
- the constructor
- two operator functions: aggregateRow and aggregatePartialRow
- two getter functions: getPartialRow and getFinalValue
- reset function: reset
NonDecomposableAggregatorFunction
A NonDecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the optional GROUP BY clause; if there is no GROUP BY clause, the entire table forms a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A NonDecomposableAggregatorFunction consists of four functions:
- the constructor
- operator function: aggregateRow
- getter function: getFinalValue
- reset function: reset
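
The six-function shape of a decomposable aggregate can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK's API: StubRuntimeContract and SketchDecomposableAggregator are hypothetical stand-ins defined here so the example compiles on its own, and they use plain long values where the real interfaces work with row iterators. Only the method names come from the list above.

```java
// Hypothetical stand-in for the SDK's runtime contract; the real class is not reproduced here.
final class StubRuntimeContract {
    final String inputColumn;

    StubRuntimeContract(String inputColumn) {
        this.inputColumn = inputColumn;
    }
}

// Hypothetical, simplified interface that mirrors the method names listed above.
// The real SDK signatures differ from these.
interface SketchDecomposableAggregator {
    void aggregateRow(long value);          // fold one input row into the running state
    void aggregatePartialRow(long partial); // fold in a partial result computed elsewhere
    long getPartialRow();                   // expose the running state as a partial result
    long getFinalValue();                   // produce the single value for the partition
    void reset();                           // clear the state before the next partition
}

// A sum aggregate written against the stand-in interface.
final class SumSketch implements SketchDecomposableAggregator {
    private final String inputColumn;
    private long sum;

    // Public constructor that takes the (stand-in) runtime contract.
    public SumSketch(StubRuntimeContract contract) {
        this.inputColumn = contract.inputColumn;
        this.sum = 0L;
    }

    @Override
    public void aggregateRow(long value) {
        sum += value;
    }

    @Override
    public void aggregatePartialRow(long partial) {
        sum += partial;
    }

    @Override
    public long getPartialRow() {
        return sum;
    }

    @Override
    public long getFinalValue() {
        return sum;
    }

    @Override
    public void reset() {
        sum = 0L;
    }
}

final class SumSketchDemo {
    public static void main(String[] args) {
        StubRuntimeContract contract = new StubRuntimeContract("amount");

        // Two aggregators each see part of the same partition.
        SumSketch first = new SumSketch(contract);
        first.aggregateRow(1);
        first.aggregateRow(2);

        SumSketch second = new SumSketch(contract);
        second.aggregateRow(3);

        // Merge the second partial result into the first, then read the final value.
        first.aggregatePartialRow(second.getPartialRow());
        System.out.println(first.getFinalValue()); // prints 6
    }
}
```

A non-decomposable aggregate would follow the same pattern with only the constructor, aggregateRow, getFinalValue, and reset.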

From an external user's perspective, decomposable and non-decomposable aggregate functions produce the same output for the same input. The difference lies in internal execution: a decomposable aggregate function allows more parallel execution and can therefore perform better.
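
The following sketch illustrates why the two styles agree on the result while differing in execution. It counts the rows of one partition in a single pass, then counts the same rows in two independent chunks and merges the partial counts. CountPaths and its methods are hypothetical names for this illustration, the chunks are processed sequentially here rather than on separate workers, and nothing in it uses the SDK.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: CountPaths is a hypothetical class, not part of the SDK.
final class CountPaths {

    // Non-decomposable style: one aggregator visits every row of the partition.
    static long countSinglePass(List<String> rows) {
        long count = 0;
        for (String ignored : rows) {
            count += 1;
        }
        return count;
    }

    // Decomposable style: each chunk is counted on its own, then the
    // partial counts are merged by addition.
    static long countSplitAndMerge(List<List<String>> chunks) {
        long merged = 0;
        for (List<String> chunk : chunks) {
            long partial = countSinglePass(chunk); // independent work per chunk
            merged += partial;                     // merge step
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> partition = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> chunks = Arrays.asList(
                partition.subList(0, 2),
                partition.subList(2, 5));

        System.out.println(countSinglePass(partition));  // prints 5
        System.out.println(countSplitAndMerge(chunks));  // prints 5
    }
}
```

Note that for a count, folding in a row adds one while folding in a partial result adds the partial count, which is why a decomposable aggregate needs both aggregateRow and aggregatePartialRow.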