15.10 - Example: Creating a Java Aggregate Function - Teradata Database

Teradata Database SQL Data Definition Language Syntax and Examples

prodname
Teradata Database
vrm_release
15.10
created_date
December 2015
category
Programming Reference
featnum
B035-1144-151K

The following request creates a Java aggregate UDF named std_dev:

     CREATE FUNCTION std_dev(
        x FLOAT)
     RETURNS FLOAT
     CLASS AGGREGATE(79)
     LANGUAGE JAVA
     NO SQL
     PARAMETER STYLE JAVA
     EXTERNAL NAME 'UDF_JAR:UserDefinedFunctions.std_dev(
                    com.teradata.fnc.Phase,
                    com.teradata.fnc.Context[],
                    double) 
                    returns java.lang.Double';

For aggregate UDFs, you must specify two extra object type parameters as part of the Java method: com.teradata.fnc.Phase , which specifies the current aggregate phase, and com.teradata.fnc.Context, which enables you to get or set the context during the execution. For details about these two parameters, see SQL External Routine Programming, B035-1147.

You can store the intermediate result of aggregate UDFs using either a byte-based or an object-based approach. The byte-based approach consumes less memory and provides better performance, but you must encode and decode the byte array stored in the intermediate storage yourself rather than the system doing the encoding and decoding for you automatically. The object-based method is more user-friendly because the encoding and decoding is done by Java object serialization, but it also affects performance negatively.

The object-based approach for Java aggregate functions requires you to specify a number of bytes with the CLASS AGGREGATE phrase whenever the number of bytes required by intermediate aggregate storage exceeds the default value of 64.

The specified CLASS AGGREGATE value is the maximum number of bytes that the intermediate aggregate storage can occupy in memory. Always set the value as small as possible, because its negative affect on the performance of the function is positively correlated with the class aggregate size.

For Java aggregate functions, the intermediate result is stored as an object that is serialized to a byte array. To keep the size of the intermediate storage small, it is a good practice to avoid using inner classes and to choose short names for the class and its data members. You should also calculate the number of bytes that is required to store the intermediate result for the object, then specify that number in the CLASS AGGREGATE clause. The following code fragment shows how to calculate the serialized size of an object in bytes:

     public static int getSize(Object obj){
            int size=0;
            try{
              ByteArrayOutputStream barr = new ByteArrayOutputStream();
              ObjectOutput s = new ObjectOutputStream(barr);
              s.writeObject(obj);
              s.close();
              size=barr.toByteArray().length;
              System.out.println("obj="+obj+",size="+size);
            }catch(IOException e){
               e.printStackTrace();
            }
              return size;
         }