An aggregator is a function that combines a set of rows into a single output row. Aggregator functions are designed for a distributed setting, where the input rows are combined in a commutative and associative fashion until a single output row is produced. The function must therefore be able to combine both input rows and partially aggregated results.
For example, a partially aggregated row of an aggregator that computes an average would consist of a sum and a count. Moreover, because the function might be used to combine rows from multiple groups, it must be resettable at group or partition breaks.
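The average example can be sketched in Java as follows. This is an illustration only: the `AvgPartial` class and its methods are hypothetical names, not part of the SDK, and returning NaN for empty input is an assumption. It shows why the partial state holds a sum and a count rather than an average: two such states can be merged in any order or grouping and still yield the same final result.

```java
// Hypothetical partial state for an average aggregator (not an SDK class).
final class AvgPartial {
    final double sum;
    final long count;

    AvgPartial(double sum, long count) {
        this.sum = sum;
        this.count = count;
    }

    // Partial state that represents a single input value.
    static AvgPartial of(double value) {
        return new AvgPartial(value, 1);
    }

    // Combines two partial states. Addition makes the merge commutative and
    // associative (exactly for the count, up to floating-point rounding for the sum).
    AvgPartial merge(AvgPartial other) {
        return new AvgPartial(sum + other.sum, count + other.count);
    }

    // Final value. Returning NaN for empty input is an assumption of this sketch.
    double average() {
        return count == 0 ? Double.NaN : sum / count;
    }
}
```

For instance, if one worker merges the values 1 and 2 into (sum 3, count 2) and another merges 3 and 6 into (sum 9, count 2), merging the two partial states in either order gives (sum 12, count 4), and the final average is 3.0.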
The interface for implementing an aggregator is DecomposableAggregatorFunction, which is a sub-interface of AggregatorFunction. DecomposableAggregatorFunction provides these methods:
- The reset() method initializes the aggregator or resets it at group or partition breaks. Note that an aggregator can be asked to return a value immediately after it is reset, for example when an input table, group, or partition is empty. Consequently, the initial value should reflect the correct aggregator semantics for empty input.
- The aggregateRow() method updates the aggregator with a new input row.
- The aggregatePartialRow() method updates the aggregator with a new partially aggregated row.
- The getPartialRow() method returns the current value of the aggregator in its partially aggregated format (for example, sum and count).
- The getFinalRow() method returns the current value of the aggregator in its final aggregated format (for example, sum / count).
In addition to these five methods, an aggregator must also define a public constructor that completes an aggregator runtime contract (DecomposableAggregatorRuntimeContract).
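How the five methods fit together can be sketched with an average aggregator. The code below is a simplified stand-in, not the SDK API: the `SimpleDecomposableAggregator` interface, its signatures, and the use of `double` and `double[]` in place of SDK row types are all assumptions made so the example is self-contained. The runtime-contract constructor is omitted for the same reason, and returning `null` for empty input is an assumed convention that mirrors SQL `AVG`.

```java
// Hypothetical stand-in for the five methods described above. The real
// interface operates on SDK row types; these signatures are assumptions.
interface SimpleDecomposableAggregator {
    void reset();
    void aggregateRow(double value);
    void aggregatePartialRow(double[] partial);
    double[] getPartialRow();
    Double getFinalRow();
}

final class AverageAggregator implements SimpleDecomposableAggregator {
    private double sum;
    private long count;

    // The real constructor would complete a runtime contract; omitted here.
    AverageAggregator() {
        reset();
    }

    // Returns to the empty-input state; also called at group or partition breaks.
    @Override
    public void reset() {
        sum = 0.0;
        count = 0;
    }

    // Folds one raw input value into the running state.
    @Override
    public void aggregateRow(double value) {
        sum += value;
        count++;
    }

    // Folds in a partial row laid out as {sum, count}.
    @Override
    public void aggregatePartialRow(double[] partial) {
        sum += partial[0];
        count += (long) partial[1];
    }

    // Emits the current state in the partial format {sum, count}.
    @Override
    public double[] getPartialRow() {
        return new double[] {sum, count};
    }

    // Emits sum / count, or null for empty input (an assumed convention).
    @Override
    public Double getFinalRow() {
        return count == 0 ? null : sum / count;
    }
}
```

In this sketch, two workers would each call aggregateRow() on their local rows and emit getPartialRow(); a combining instance would then pass those partial rows to aggregatePartialRow() and call getFinalRow(). Calling getFinalRow() right after reset() returns the empty-input value, which is the case the reset() description warns about.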
The following table describes the primary global aggregator classes and interfaces.
Name | Type | Description |
---|---|---|
DecomposableAggregatorFunction | Interface | A sub-interface of AggregatorFunction. Provides methods for aggregating rows and partial rows, as well as methods for resetting the aggregator and producing final aggregator results. |
GraphRuntimeContract | Final Class | Provides methods for registering aggregators. Extends BaseRuntimeContract. |
GraphRuntimeContract.CompletedContract | Static Final Class | Provides an immutable description of a graph runtime contract which includes methods to retrieve the completed contracts of registered aggregators. Extends BaseRuntimeContract.CompletedContract. |
GraphGlobals | Final Class | Provides a graph function with access to a common global state which includes read and update access to registered aggregators. |
In ADE, you can add global aggregators to your SQL-GR function.