DATASET Type Usage | DATASET Data Type | VantageCloud Lake - DATASET Type Usage - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

User-Defined Functions

Use the CREATE/REPLACE FUNCTION statement to create a user-defined function (UDF) containing one or more parameters, or a DATASET return type of any of supported storage format. The parameters and return types are supported on scalar, aggregate, and table UDFs, and SQL UDFs. When the return type is specified as one of these types for SQL UDF, the RETURN expression may be an SQL statement evaluating to one type.

The DATASET type as a parameter to a UDF is supported for LANGUAGE C, CPP, and JAVA, but LANGUAGE R is not supported.

The following actions occur on the Data Dictionary CREATE FUNCTION statement for the DATASET type:
  • Dictionary updates that typically occur on a CREATE FUNCTION statement.
  • The row inserted to DBC.TVFields to record metadata information about the DATASET field indicates that the field is a DATASET type. The field shares entries with the UDTs. The FieldType is 'DT', and the TypeId corresponds to the static type ID assigned to the DATASET type.

Table Operators

The DATASET type is supported in C language and Java language user-defined table operators. The metadata is passed to the table operator contract function using an external type code such as DATASET_AVRO_DT or DATASET_CSV_DT. The type codes for dtype_en include the following. See sqltypes_td.h for the complete definition.

typedef enum dtype_en
{
        UNDEF_DT=0,
        CHAR_DT=1,
        VARCHAR_DT=2,
        BYTE_DT=3,
        VARBYTE_DT=4,
        GRAPHIC_DT=5,
        VARGRAPHIC_DT=6, 
        BYTEINT_DT=7,
        SMALLINT_DT=8,
        INTEGER_DT=9,
        BIGINT_DT = 36,
        REAL_DT=10,
        DECIMAL1_DT=11,'
        DECIMAL2_DT=12,
        DECIMAL4_DT=13,
        DECIMAL8_DT=14, 
        DECIMAL16_DT=37,  
        DATE_DT=15,
        TIME_DT=16,
        TIMESTAMP_DT=17,
        INTERVAL_YEAR_DT=18,
        INTERVAL_YTM_DT=19,
        INTERVAL_MONTH_DT=20,
        INTERVAL_DAY_DT=21,
        INTERVAL_DTH_DT=22,
        INTERVAL_DTM_DT=23,
        INTERVAL_DTS_DT=24,
        INTERVAL_HOUR_DT=25,
        INTERVAL_HTM_DT=26,
        INTERVAL_HTS_DT=27,
        INTERVAL_MINUTE_DT=28,
        INTERVAL_MTS_DT=29,
        INTERVAL_SECOND_DT=30,
        TIME_WTZ_DT=31,
        TIMESTAMP_WTZ_DT=32,
        BLOB_REFERENCE_DT=33,
        CLOB_REFERENCE_DT=34, 
        UDT_DT=35,
        /* The 8 byte integer type (BIGINT_DT) and 
         * the 16 byte decimal type (DECIMAL16_DT) 
         * are located above and have the following
         * values:
         *
         * BIGINT_DT=36    
         * DECIMAL16_DT=37 
         */                   
        NUMBER_DT=38, 
        PERIOD_DT=39, 
        JSON_DT=40, 
        DATASET_AVRO_DT=41,
        ST_GEOMETRY_DT=42,
        MBR_DT=43,
        MBB_DT=44,
        ARRAY_DT=45,
        XML_DT = 46, 
	    DATASET_CSV_DT=47,
        FNC_DATATYPESETSIZE=48
} dtype_en;

For AVRO, the complex types map to the following base type, DATASET_AVRO_DT → BLOB_REFERENCE_DT. For CSV, the complex types map to the following base type, DATASET_CSV_DT → CLOB_REFERENCE_DT

When input data values are sent to a table operator, the data is transferred in the current default transform. Each possible transform type populates the UDT_BaseInfo_t.transform_info structure, as shown in the following table:

Transform Type Datatype Column Size.length
TD_DATASET_AVRO_VARBYTE VARBYTE_DT column_name The data size
TD_DATASET_AVRO_BLOB BLOB_REFERENCE_DT column_name The data size
For CSV,the transform types populate the UDT_BaseInfo_t.transform_info structure, as shown in the following table:
Transform Type Datatype Column ... Charset ... Size.length
TD_CSV_CLOB CLOB_REFERENCE_DT <name of the column>       The data size

There is no "transforms off" functionality. The udt_indicator member value of the UDT_BaseInfo_t structure identifies the DATASET storage formats:

10==DATASET STORAGE FORMAT AVRO

11==DATASET STORAGE FORMAT CSV

For CSV, the charset field of the UDT_BaseInfo_t structure determines the character set as either LATIN or UNICODE.

External Stored Procedures

The CREATE/REPLACE PROCEDURE statement was extended to create an external stored procedure containing one or more parameters that are DATASET types of any of the supported storage formats. Use these types to define the IN, OUT, or INOUT parameters.

The DATASET type as an IN, OUT, or INOUT parameter to an external stored procedure is supported for LANGUAGE C, CPP, or JAVA. The LANGUAGE R option is not supported.

The following actions occur on the Data Dictionary on a CREATE/REPLACE PROCEDURE statement for the DATASET type:
  • Dictionary updates that typically occur on a CREATE/REPLACE PROCEDURE statement.
  • The row typically inserted to DBC.TVFields to record metadata information about the DATASET field was enhanced to indicate that the field is a DATASET type. The field shares entries with the UDTs. The FieldType is 'DT', and the TypeId corresponds to the static type ID assigned to the DATASET type.