This section guides you through writing and invoking a simple user-defined aggregate function in Java. It assumes you have downloaded the Aster Execution Engine SQL-Analytics SDK.
The Java class for your aggregate function must implement one of the following interfaces:
- com.asterdata.ncluster.aggregator.DecomposableAggregatorFunction
- com.asterdata.ncluster.aggregator.NonDecomposableAggregatorFunction
The decomposable aggregate class must implement a public constructor that takes:
com.asterdata.ncluster.aggregator.DecomposableAggregatorRuntimeContract
The non-decomposable aggregate class must implement a public constructor that takes:
com.asterdata.ncluster.aggregator.NonDecomposableAggregatorRuntimeContract
The name of your user-defined Java function is the name of the Java class, ignoring case differences. For example, a function named count might be implemented by a Java class com.mycompany.Count.
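The naming rule above can be illustrated with a small sketch. It only mimics the rule as described (class name compared to the function name, ignoring case). How the engine actually resolves names is not shown in this section, and the helper `NameMatchSketch` is hypothetical.

```java
// Stand-in for a user class such as com.mycompany.Count (package omitted here).
final class Count {}

// Hypothetical helper, not part of the SDK: applies the "class name,
// ignoring case" rule to a SQL function name.
final class NameMatchSketch {
    static boolean matches(String sqlFunctionName, Class<?> implementation) {
        // getSimpleName() drops the package, so com.mycompany.Count yields "Count".
        return implementation.getSimpleName().equalsIgnoreCase(sqlFunctionName);
    }
}
```

Under this rule, `count`, `COUNT`, and `Count` would all refer to the same class.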
Teradata Aster’s SQL-Analytics framework supports two types of aggregate functions. A user-defined aggregate function must implement one of these two interfaces:
- DecomposableAggregatorFunction
A DecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the optional GROUP BY clause; without a GROUP BY clause, the entire table is treated as a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A DecomposableAggregatorFunction consists of six functions:
- the constructor
- two operator functions: aggregateRow and aggregatePartialRow
- two getter functions: getPartialRow and getFinalValue
- reset function: reset
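The six functions listed above might fit together as in the following sketch of an average aggregate. It does not use the SDK: `AvgContractSketch` is a hypothetical stand-in for the runtime contract, and rows and partial rows are simplified to `Double` and `double[]{sum, count}`. The real interface signatures and row types should be taken from the SDK Javadoc.

```java
import java.util.Iterator;

// Hypothetical stand-in for the SDK's runtime contract class.
final class AvgContractSketch {}

// Sketch of a decomposable average. In a real UDF the class would be public
// and would implement the SDK's DecomposableAggregatorFunction interface.
final class AvgSketch {
    private double sum;
    private long count;

    // 1. Constructor: public, takes the (stand-in) runtime contract.
    public AvgSketch(AvgContractSketch contract) {}

    // 2. Operator: fold raw input rows from one partition into the running state.
    public void aggregateRow(Iterator<Double> rows) {
        while (rows.hasNext()) {
            sum += rows.next();
            count++;
        }
    }

    // 3. Operator: fold partial results produced by other instances.
    public void aggregatePartialRow(Iterator<double[]> partials) {
        while (partials.hasNext()) {
            double[] partial = partials.next();
            sum += partial[0];
            count += (long) partial[1];
        }
    }

    // 4. Getter: export the intermediate state so another instance can merge it.
    public double[] getPartialRow() {
        return new double[] {sum, count};
    }

    // 5. Getter: produce the single output value for the partition.
    public double getFinalValue() {
        return count == 0 ? Double.NaN : sum / count;
    }

    // 6. Reset: clear the state before the next partition.
    public void reset() {
        sum = 0;
        count = 0;
    }
}
```

An average is a convenient example because its intermediate state (a sum and a count) can be merged from several partial results without revisiting the input rows.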
- NonDecomposableAggregatorFunction
A NonDecomposableAggregatorFunction consumes multiple rows as input and returns a single value for each input partition. The number of partitions is determined by the distinct values of the optional GROUP BY clause; without a GROUP BY clause, the entire table is treated as a single partition. From an interface perspective, the function is passed an iterator over an arbitrary set of rows in the same partition. A NonDecomposableAggregatorFunction consists of four functions:
- the constructor
- operator function: aggregateRow
- getter function: getFinalValue
- reset function: reset
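The four functions listed above might look like the following sketch of a median aggregate, chosen here on the assumption that a median is a typical non-decomposable case, since it cannot in general be computed from the medians of subsets. The sketch does not use the SDK: `MedianContractSketch` is a hypothetical stand-in for the runtime contract, and rows are simplified to `Double`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the SDK's runtime contract class.
final class MedianContractSketch {}

// Sketch of a non-decomposable median. In a real UDF the class would be public
// and would implement the SDK's NonDecomposableAggregatorFunction interface.
final class MedianSketch {
    private final List<Double> values = new ArrayList<>();

    // 1. Constructor: public, takes the (stand-in) runtime contract.
    public MedianSketch(MedianContractSketch contract) {}

    // 2. Operator: collect the rows of the current partition.
    public void aggregateRow(Iterator<Double> rows) {
        while (rows.hasNext()) {
            values.add(rows.next());
        }
    }

    // 3. Getter: compute the single output value for the partition.
    public double getFinalValue() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        int mid = n / 2;
        // Odd count: middle element. Even count: mean of the two middle elements.
        return n % 2 == 1 ? sorted.get(mid) : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    // 4. Reset: clear the state before the next partition.
    public void reset() {
        values.clear();
    }
}
```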
From an external user's perspective, decomposable and non-decomposable aggregate functions return the same value for the same input. The difference lies in internal execution: a decomposable aggregate function allows more of the work to run in parallel and can therefore perform better.
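The point about parallelism can be sketched with a plain sum, independent of the SDK. The class `ParallelSumSketch` and its methods are hypothetical. They contrast a single pass over all rows with a two-phase evaluation in which each chunk is reduced on its own and the partial results are then combined.

```java
import java.util.List;

// Hypothetical illustration, not SDK code.
final class ParallelSumSketch {
    // Single pass: one worker sees every row of the partition.
    static long singlePass(List<Long> rows) {
        long total = 0;
        for (long row : rows) {
            total += row;
        }
        return total;
    }

    // Two phases: each chunk is summed independently (here on a parallel
    // stream, standing in for separate workers), then the partial sums
    // are combined into the final value.
    static long twoPhase(List<List<Long>> chunks) {
        return chunks.parallelStream()
                     .mapToLong(ParallelSumSketch::singlePass)
                     .sum();
    }
}
```

Both methods return the same total for the same rows; only the second leaves room for the chunks to be processed concurrently.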