Compatibility Considerations for UDFs

Compatibility Considerations for UDFs - Teradata Vantage

Teradata Vantage™ User Guide

Product

Teradata Vantage

Release Number

1.1

Published

May 2020

Language

English (United States)

Last Update

2020-05-28

dita:mapPath

ioz1543440393126.ditamap

dita:ditavalPath

jsj1481748799576.ditaval

dita:id

B700-4002

lifecycle

Product Category

Teradata Vantage

To ensure the compatibility of UDFs developed on ML Engine with Advanced SQL Engine, consider the following:

For a function implementing a RowFunction interface (function with input ON clause without PARTITION BY or with PARTITION BY ANY), input data is transferred from Advanced SQL Engine to ML Engine as a keyless table (a table with no partition key). Therefore, different executions of the function may result in different data distribution on ML Engine.
For a function implementing a PartitionFunction interface (a function with an input ON clause with PARTITION BY KEY), input data is transferred from Advanced SQL Engine to ML Engine based on the partitioning key. Therefore, different executions of the function result in the same data distribution on ML Engine. However, the order of input data in each distribution may vary. Use an ORDER BY clause with the input ON clause to ensure that the function processes input in the same order every time.
Input tables on which a function operates are temporarily created tables in ML Engine and are cleaned up after function execution. When the function calls the getInputNames of class baseRuntimeContract.CompletedContract in ML Engine, it gets the name of the temporary table instead of the input table provided in the SQL call on Advanced SQL Engine.
Tables output by a function are also temporarily created tables in ML Engine. ML Engine transfers these tables to Advanced SQL Engine after the execution, renamed on Advanced SQL Engine based on the names supplied in the SQL call, and removed from ML Engine afterwards.
Output columns created by the function must begin with a lowercase letter or an underscore (_). Subsequent characters can be lowercase letters, digits (0-9), or underscores.
Select the data type of output produced by the functions based on the data type mapping table in Data Type Mapping between Advanced SQL Engine and ML Engine. An unbounded VARCHAR data type on ML Engine maps to a CLOB data type on Advanced SQL Engine. Use VARCHAR(n) if the data in the output is not expected to grow beyond n=32000 (the maximum for a bounded VARCHAR). Similarly, use bounded numeric data types instead of unbounded numeric data types.
Most driver functions require a JSON function descriptor file inside its archive (jar or zip). A driver function is a function that uses a JDBC driver inside its code to execute SQL statements. If a driver function has any of these characteristics, it will need a JSON function descriptor to execute correctly:
- Its implementation takes the name of an input table from the function argument clause.
- It creates an output table (secondary output) in addition to the SELECT statement output (primary output).
To return syntax and usage information (using the HELP FOREIGN FUNCTION command), a function must have a JSON function descriptor file packaged inside its archive (jar or zip).

See JSON Function Descriptor for the specifications of a JSON file and steps to include it in the function archive. Any non-driver function or an auxiliary function (a function used internally by a driver function and does not appear in the SQL call on Advanced SQL Engine) does not need a JSON file in order to be executed.