User-defined Functions
Use the CREATE/REPLACE FUNCTION statement to create a user-defined function (UDF) containing one or more parameters, or a DATASET return type of any of supported storage format. The parameters and/or return types are supported on scalar, aggregate, and table UDFs, and SQL UDFs. When the return type is specified as one of these types for SQL UDF, the RETURN expression may be an SQL statement evaluating to one type.
The DATASET type as a parameter to a UDF is supported for LANGUAGE C, CPP, and JAVA, but LANGUAGE R is not supported.
- The row inserted to DBC.TVFields to record metadata information about the DATASET field indicates it is a DATASET type. It shares some entries with the UDTs. The FieldType is 'DT,' and the TypeId corresponds to the static type ID assigned to the DATASET type.
Table Operators
The DATASET type is supported in C language and Java language user-defined table operators. The metadata is passed to the table operator contract function using an external type code such as DATASET_AVRO_DT or DATASET_CSV_DT. The type codes for dtype_en include the following. See sqltypes_td.h for the complete definition.
typedef enum dtype_en { UNDEF_DT=0, CHAR_DT=1, VARCHAR_DT=2, BYTE_DT=3, VARBYTE_DT=4, GRAPHIC_DT=5, VARGRAPHIC_DT=6, BYTEINT_DT=7, SMALLINT_DT=8, INTEGER_DT=9, BIGINT_DT = 36, REAL_DT=10, DECIMAL1_DT=11,' DECIMAL2_DT=12, DECIMAL4_DT=13, DECIMAL8_DT=14, DECIMAL16_DT=37, DATE_DT=15, TIME_DT=16, TIMESTAMP_DT=17, INTERVAL_YEAR_DT=18, INTERVAL_YTM_DT=19, INTERVAL_MONTH_DT=20, INTERVAL_DAY_DT=21, INTERVAL_DTH_DT=22, INTERVAL_DTM_DT=23, INTERVAL_DTS_DT=24, INTERVAL_HOUR_DT=25, INTERVAL_HTM_DT=26, INTERVAL_HTS_DT=27, INTERVAL_MINUTE_DT=28, INTERVAL_MTS_DT=29, INTERVAL_SECOND_DT=30, TIME_WTZ_DT=31, TIMESTAMP_WTZ_DT=32, BLOB_REFERENCE_DT=33, CLOB_REFERENCE_DT=34, UDT_DT=35, /* The 8 byte integer type (BIGINT_DT) and * the 16 byte decimal type (DECIMAL16_DT) * are located above and have the following * values: * * BIGINT_DT=36 * DECIMAL16_DT=37 */ NUMBER_DT=38, PERIOD_DT=39, JSON_DT=40, DATASET_AVRO_DT=41, ST_GEOMETRY_DT=42, MBR_DT=43, MBB_DT=44, ARRAY_DT=45, XML_DT = 46, DATASET_CSV_DT=47, FNC_DATATYPESETSIZE=48 } dtype_en;
For AVRO, the complex types map to the following base type, DATASET_AVRO_DT → BLOB_REFERENCE_DT. For CSV, the complex types map to the following base type, DATASET_CSV_DT → CLOB_REFERENCE_DT
When input data values are sent to a table operator, the data is transferred in the current default transform. Each possible transform type populates the UDT_BaseInfo_t.transform_info structure, as shown in the following table:
Transform Type | Datatype | Column | … | Size.length |
---|---|---|---|---|
TD_DATASET_AVRO_VARBYTE | VARBYTE_DT | <name of the column> | The data size | |
TD_DATASET_AVRO_BLOB | BLOB_REFERENCE_DT | <name of the column> | The data size |
Transform Type | Datatype | Column | ... | Charset | ... | Size.length |
---|---|---|---|---|---|---|
TD_CSV_CLOB | CLOB_REFERENCE_DT | <name of the column> | The data size |
There is no "transforms off" functionality. The UDT_BaseInfo_t structure's udt_indicator member value identifies the DATASET storage formats:
10==DATASET STORAGE FORMAT AVRO
11==DATASET STORAGE FORMAT CSV
For CSV, the UDT_BaseInfo_t's charset field determines the character set as either LATIN or UNICODE.
External Stored Procedures
The CREATE/REPLACE PROCEDURE statement was extended to create an external stored procedure containing one or more parameters that are DATASET types of any of the supported storage formats. Use these types to define the IN, OUT, or INOUT parameters.
The DATASET type as an IN, OUT, or INOUT parameter to an external stored procedure is supported for LANGUAGE C, CPP, or JAVA. The LANGUAGE R option is not supported.
- The row normally inserted to DBC.TVFields to record metadata information about the DATASET field was enhanced to indicate that it is a DATASET type. It shares some entries with the UDTs. The FieldType is 'DT', and the TypeId corresponds to the static type ID assigned to the DATASET type.