16.20 - DATASET Data Type - Teradata Vantage NewSQL Engine

Teradata Vantage™ NewSQL Engine Release Summary

prodname: Teradata Database, Teradata Vantage NewSQL Engine
vrm_release: 16.20
created_date: March 2019
category: Release Notes
featnum: B035-1098-162K

The Teradata DATASET data type is a new Complex Data Type (CDT) that represents self-describing files, which are interpreted according to a schema. This feature provides the following functionality to support the storage and processing of DATASET data in Teradata Database:
  • A DATASET data type, stored in the Avro file format. The DATASET data type is designed to accommodate multiple storage formats, but for Release 16.00, only Avro is supported.
  • Methods, functions, and procedures for processing, shredding, and publishing DATASET data.
  • DATASET documents up to 16 MB in size.
The feature also provides enhanced dot notation. Dot notation was introduced for the JSON data type in Release 15.0, and is now extended to the DATASET data type. Dot notation includes the following syntax:
  • Recursive descent operator (..)
  • Wildcards (*), for both named and indexed items
  • Name/index lists ([a,b,c] or [0,3,5])
  • Name/index slices ([a:c] or [0:5])
  • Simple name/index references
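
As a rough illustration of what each dot notation form selects, the following Python sketch applies the same ideas to a plain nested structure. It is not Teradata's evaluator and not Teradata syntax: the helper names (by_name, by_index, by_slice, wildcard, descend) and the sample order document are hypothetical, and the half-open slice bounds are an assumption borrowed from Python. Name slices ([a:c]) are not modeled.

```python
from typing import Any, List


def children(node: Any) -> List[Any]:
    """Direct children of a record (dict) or array (list); scalars have none."""
    if isinstance(node, dict):
        return list(node.values())
    if isinstance(node, list):
        return list(node)
    return []


def by_name(node: Any, *names: str) -> List[Any]:
    """Simple name reference (.a) or name list ([a,b,c]) on a record."""
    return [node[n] for n in names if isinstance(node, dict) and n in node]


def by_index(node: Any, *indexes: int) -> List[Any]:
    """Simple index reference ([0]) or index list ([0,3,5]) on an array."""
    return [node[i] for i in indexes
            if isinstance(node, list) and 0 <= i < len(node)]


def by_slice(node: Any, start: int, stop: int) -> List[Any]:
    """Index slice ([0:5]); half-open bounds are an assumption of this sketch."""
    return node[start:stop] if isinstance(node, list) else []


def wildcard(node: Any) -> List[Any]:
    """Wildcard (*): every named or indexed item directly under the node."""
    return children(node)


def descend(node: Any, name: str) -> List[Any]:
    """Recursive descent (..name): values stored under `name` at any depth."""
    found = []
    if isinstance(node, dict) and name in node:
        found.append(node[name])
    for child in children(node):
        found.extend(descend(child, name))
    return found


# Hypothetical sample document, used only for illustration.
order = {
    "id": 7,
    "customer": {"name": "Ada", "city": "Oslo"},
    "items": [
        {"name": "bolt", "qty": 4},
        {"name": "nut", "qty": 9},
        {"name": "washer", "qty": 2},
    ],
}

every_name = descend(order, "name")          # names at every depth
first_and_last = by_index(order["items"], 0, 2)
customer_values = wildcard(order["customer"])
```

In this sketch every form returns a list, so a reference that matches nothing yields an empty list rather than an error. How the database reports unmatched or multi-valued references may differ.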

Benefits

The DATASET data type provides the following functionality for DATASET data:
  • Variable length: Both the maximum length and the in-row length are variable.
  • Variable format of data stored: DATASET provides a built-in storage format for Avro. The data type allows for multiple storage formats, but for now, only Avro is supported.
  • Variable schema for data stored in any format: Users may define schemas at the column-level or instance-level for any of the built-in storage formats of the DATASET type. Column-level schemas are binding for all instances of the data type loaded into that particular column, whereas instance-level schemas may vary from instance to instance.
  • Methods: Methods to operate on the DATASET type in any storage format and with any schema.
  • Functions: Functions to operate on the DATASET type in any storage format and with any schema.
  • Publishing to DATASET data type: Use data stored in relational tables to compose a DATASET type with any storage format and any schema.
  • DATASET shredding: Allows you to extract values from DATASET documents and store the extracted data in relational format.
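
The relationship between schema binding, publishing, and shredding described above can be sketched conceptually in Python. This is an illustration under stated assumptions, not the database's implementation: plain dicts stand in for Avro-encoded DATASET values, a mapping of field names to Python types stands in for an Avro schema, and the names effective_schema, publish, and shred are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Schema = Dict[str, type]  # field name -> expected type (stand-in for an Avro schema)
Row = Dict[str, Any]      # one relational row, keyed by column name


def effective_schema(column_schema: Optional[Schema],
                     instance_schema: Optional[Schema]) -> Schema:
    """A column-level schema binds every instance in the column;
    without one, each instance must carry its own schema."""
    if column_schema is not None:
        return column_schema
    if instance_schema is None:
        raise ValueError("no column-level schema, so the instance must supply one")
    return instance_schema


def publish(rows: List[Row], schema: Schema) -> Dict[str, Any]:
    """Compose one self-describing document from relational rows."""
    for row in rows:
        for field, expected in schema.items():
            if not isinstance(row[field], expected):
                raise TypeError(f"{field!r} must be {expected.__name__}")
    return {
        "schema": {field: t.__name__ for field, t in schema.items()},
        # Only fields named in the schema are carried into the document.
        "records": [{field: row[field] for field in schema} for row in rows],
    }


def shred(document: Dict[str, Any]) -> List[Tuple[Any, ...]]:
    """Extract document values into relational tuples, in schema field order."""
    fields = list(document["schema"])
    return [tuple(record[f] for f in fields) for record in document["records"]]


# Hypothetical data, used only for illustration.
part_schema: Schema = {"id": int, "name": str}
rows = [{"id": 1, "name": "bolt", "extra": "x"}, {"id": 2, "name": "nut"}]

document = publish(rows, effective_schema(part_schema, None))
tuples = shred(document)  # values back in relational form
```

Because the document carries its schema alongside its records, shred needs no outside information to recover the column order, which is the sense in which the stored data is self-describing.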

Avro data stored as a DATASET data type transforms to and from SQL as a VARBYTE or BLOB.

SQL Changes

The following statements are new for the DATASET data type:
  • CREATE <storage-format-name> SCHEMA
  • DROP <storage-format-name> SCHEMA
  • SHOW <storage-format-name> SCHEMA
  • HELP <storage-format-name> SCHEMA
  • SET SESSION DOT NOTATION
The following statements were modified for the DATASET data type:
  • CREATE TABLE
  • ALTER TABLE
  • CREATE/REPLACE FUNCTION
  • CREATE INDEX
  • COLLECT STATISTICS
  • HELP, SHOW, and TYPE commands

Additional Information

For more information, see Teradata Vantage™ DATASET Data Type, B035-1198.