During the Phase.AGR_INIT aggregation phase, a UDF must also save the first set of data values passed in through arguments into the intermediate aggregate storage area. This first set of data values belongs to the first row to aggregate for the group.
Thereafter, each time the method is invoked during the Phase.AGR_DETAIL aggregation phase, it must accumulate the row data passed in through arguments into the intermediate aggregate storage area for the specific group. Each group being aggregated has its own intermediate storage area.
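The two phases described above can be sketched without the database as follows. This is only an illustration: the `Phase` enum, the `Storage` class, the `areas` map, and the `step` method are hypothetical stand-ins invented for this sketch, not the UDF API itself. The map plays the role of the per-group intermediate storage areas that the database maintains for the UDF.

```java
import java.util.HashMap;
import java.util.Map;

class PhaseSketch {
    // Hypothetical stand-in for the two aggregation phases named in the text.
    enum Phase { AGR_INIT, AGR_DETAIL }

    // Hypothetical stand-in for one intermediate storage area.
    static final class Storage {
        double count; // number of rows seen
        double xSq;   // running sum of x*x
        double xSum;  // running sum of x
    }

    // One storage area per group (assumption: groups are identified by a string key).
    static final Map<String, Storage> areas = new HashMap<>();

    static void step(Phase phase, String group, double x) {
        if (phase == Phase.AGR_INIT) {
            // First row of the group: create the area and save the first values.
            Storage s = new Storage();
            s.count = 1;
            s.xSq = x * x;
            s.xSum = x;
            areas.put(group, s);
        } else {
            // Later rows: accumulate into the area that belongs to this group.
            Storage s = areas.get(group);
            s.count++;
            s.xSq += x * x;
            s.xSum += x;
        }
    }

    public static void main(String[] args) {
        step(Phase.AGR_INIT, "a", 1.0);
        step(Phase.AGR_INIT, "b", 10.0);
        step(Phase.AGR_DETAIL, "a", 3.0);
        for (Map.Entry<String, Storage> e : areas.entrySet()) {
            Storage s = e.getValue();
            System.out.println(e.getKey() + ": " + s.count + " " + s.xSq + " " + s.xSum);
        }
    }
}
```

In this sketch the caller chooses the phase; in an actual UDF the phase is supplied on each invocation, and the method only needs to branch on it.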
During the Phase.AGR_DETAIL aggregation phase, the method gains access to previously accumulated data by calling getObject() or getBytes() on the context input argument. For the standard deviation example, the following code gains access to previously accumulated data:
agr_storage s1 = (agr_storage)context[0].getObject(1);
Alternatively, if the method used a byte array to store data, the following code gains access to the previously stored values:
ByteBuffer s1 = ByteBuffer.wrap(context[0].getBytes(1));
double count = s1.getDouble();
double x_sq = s1.getDouble();
double x_sum = s1.getDouble();
- sum(X^2)
- sum(X)
If the method uses the agr_storage object to store data, the following code performs the necessary calculations using the column value, and then combines the results with the intermediate storage:
s1.count++;
s1.x_sq += x*x;
s1.x_sum += x;
context[0].setObject( 1, s1 );
Alternatively, if the method uses a byte array to store data, the code looks like this:
s1 = ByteBuffer.allocate(24);
count++;
x_sq += x*x;
x_sum += x;
s1.putDouble(count);
s1.putDouble(x_sq);
s1.putDouble(x_sum);
context[0].setBytes( 1, s1.array() );
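The byte-array variant can be tried in isolation with a small sketch that uses only java.nio.ByteBuffer. It assumes the layout implied by the snippets above: three doubles (count, sum of squares, sum) in 24 bytes. The class and the `init` and `accumulate` methods are hypothetical names for illustration; a plain byte[] stands in for the storage that getBytes() and setBytes() read and write.

```java
import java.nio.ByteBuffer;

class ByteStateSketch {
    // Assumed layout: count, sum(x*x), sum(x), each an 8-byte double.
    static final int STATE_SIZE = 3 * Double.BYTES; // 24 bytes

    // Build the state for the first row of a group.
    static byte[] init(double x) {
        return ByteBuffer.allocate(STATE_SIZE)
                .putDouble(1.0)
                .putDouble(x * x)
                .putDouble(x)
                .array();
    }

    // Read the stored values, fold in one more row, and write a new state.
    static byte[] accumulate(byte[] state, double x) {
        ByteBuffer in = ByteBuffer.wrap(state);
        double count = in.getDouble();
        double xSq = in.getDouble();
        double xSum = in.getDouble();
        return ByteBuffer.allocate(STATE_SIZE)
                .putDouble(count + 1)
                .putDouble(xSq + x * x)
                .putDouble(xSum + x)
                .array();
    }

    public static void main(String[] args) {
        byte[] state = init(2.0);
        state = accumulate(state, 4.0);
        state = accumulate(state, 6.0);
        ByteBuffer out = ByteBuffer.wrap(state);
        // Prints: 3.0 56.0 12.0
        System.out.println(out.getDouble() + " " + out.getDouble() + " " + out.getDouble());
    }
}
```

The values must be read back in the same order they were written, because the buffer carries no field names, only positions.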