Example: Creating a Java Aggregate Function - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Syntax and Examples

Product
Advanced SQL Engine
Teradata Database
Release Number
17.00
Published
September 2020
Language
English (United States)
Last Update
2021-01-23
dita:mapPath
wgr1555383704548.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1144
lifecycle
previous
Product Category
Teradata Vantage™

The following request creates a Java aggregate UDF named std_dev:

     CREATE FUNCTION std_dev(
        x FLOAT)
     RETURNS FLOAT
     CLASS AGGREGATE(79)
     LANGUAGE JAVA
     NO SQL
     PARAMETER STYLE JAVA
     EXTERNAL NAME 'UDF_JAR:UserDefinedFunctions.std_dev(
                    com.teradata.fnc.Phase,
                    com.teradata.fnc.Context[],
                    double) 
                    returns java.lang.Double';

For aggregate UDFs, you must specify two extra object type parameters as part of the Java method: com.teradata.fnc.Phase , which specifies the current aggregate phase, and com.teradata.fnc.Context, which enables you to get or set the context during the execution. For details about these two parameters, see Teradata Vantage™ - SQL External Routine Programming , B035-1147 .

You can store the intermediate result of aggregate UDFs using either a byte-based or an object-based approach. The byte-based approach consumes less memory and provides better performance, but you must encode and decode the byte array stored in the intermediate storage yourself rather than the system doing the encoding and decoding for you automatically. The object-based method is more user-friendly because the encoding and decoding is done by Java object serialization, but it also affects performance negatively.

The object-based approach for Java aggregate functions requires you to specify a number of bytes with the CLASS AGGREGATE phrase whenever the number of bytes required by intermediate aggregate storage exceeds the default value of 64.

The specified CLASS AGGREGATE value is the maximum number of bytes that the intermediate aggregate storage can occupy in memory. Always set the value as small as possible, because its negative affect on the performance of the function is positively correlated with the class aggregate size.

For Java aggregate functions, the intermediate result is stored as an object that is serialized to a byte array. To keep the size of the intermediate storage small, it is a good practice to avoid using inner classes and to choose short names for the class and its data members. You should also calculate the number of bytes that is required to store the intermediate result for the object, then specify that number in the CLASS AGGREGATE clause. The following code fragment shows how to calculate the serialized size of an object in bytes:

     public static int getSize(Object obj){
            int size=0;
            try{
              ByteArrayOutputStream barr = new ByteArrayOutputStream();
              ObjectOutput s = new ObjectOutputStream(barr);
              s.writeObject(obj);
              s.close();
              size=barr.toByteArray().length;
              System.out.println("obj="+obj+",size="+size);
            }catch(IOException e){
               e.printStackTrace();
            }
              return size;
         }