The DATASET data type, introduced in Teradata Database 16.0 to support self-describing data, was designed to support multiple storage formats. The initial release supported only the Avro storage format. This enhancement adds support for the comma-separated values (CSV) storage format. All of the features available with the existing DATASET type (for example, dot notation, column-level schemas, and functions) are supported with the new CSV storage format.
Benefits
The CSV storage format provides the following:
- Variable length: Both the maximum length and the in-row length are variable.
- Variable format of data stored: Support for CSV format is added, in addition to the existing Avro storage format.
- Methods: Methods are provided to operate on the DATASET type in any storage format and with any schema.
- Functions: Functions are provided to operate on the DATASET type in any storage format and with any schema.
- Publishing to DATASET data type: Use data stored in relational tables to compose a DATASET type with any storage format and any schema.
- Shredding: Shred a DATASET value in CSV storage format into a relational table.
- Converting: Convert a DATASET value in CSV storage format into Avro or JSON.
- Enhanced dot notation: The extended dot notation introduced in Teradata Database 16.0 is supported for the new CSV storage format.
- Extensions and schemas: The CSV storage format supports some extensions to the CSV standard, such as user-specified column and record delimiters and header field names. If any of these extensions are used, a schema must be specified. Users may define schemas at the column level or the instance level for any of the built-in storage formats of the DATASET type. A column-level schema is binding for all instances of the data type loaded into that particular column, whereas instance-level schemas may vary from instance to instance.
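To make the extension, shredding, and conversion ideas concrete, the following is a minimal Python sketch, not Teradata code. It assumes a hypothetical schema dictionary (the keys field_delimiter, record_delimiter, and field_names are invented for illustration and do not reproduce the DATASET CSV schema syntax) and shows why a value that uses nonstandard delimiters cannot be interpreted without a schema. The helper shred is likewise hypothetical.

```python
import csv
import json

# Hypothetical schema for illustration only; these keys are assumptions,
# not the schema syntax used by the DATASET type.
schema = {
    "field_delimiter": "|",
    "record_delimiter": ";",
    "field_names": ["item_id", "item_name", "price"],
}

# A delimited value that uses the nonstandard delimiters described by the schema.
raw = "1|widget|2.50;2|gadget|10.00"


def shred(value, schema):
    """Split a delimited value into one dict per record, keyed by field name.

    Simplification: records are split on the record delimiter before parsing,
    so a quoted field containing that delimiter is not handled.
    """
    records = [r for r in value.split(schema["record_delimiter"]) if r]
    reader = csv.reader(records, delimiter=schema["field_delimiter"])
    return [dict(zip(schema["field_names"], fields)) for fields in reader]


# "Shredding": each record becomes a row with named columns.
rows = shred(raw, schema)

# "Converting": the same rows serialized as JSON text.
as_json = json.dumps(rows)
```

Here rows holds one dictionary per record, which is the shape a relational table would receive, and as_json is the same data as a JSON array. All values stay strings in this sketch; type information would have to come from the schema.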
Considerations
If a CSV value requires a schema, defining the schema at the column level requires less disk space than defining it at the instance level.
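The space difference can be illustrated with a toy Python model. It assumes, purely for illustration, that a column-level schema is stored once per column and an instance-level schema is stored alongside every value; it does not describe the actual on-disk layout. The schema text and sample values are made up.

```python
import json

# Made-up schema text and values, used only to compare the two cost models.
schema_text = json.dumps(
    {"field_delimiter": "|", "field_names": ["item_id", "item_name", "price"]}
)
values = ["1|widget|2.50", "2|gadget|10.00", "3|gizmo|7.25"]


def column_level_bytes(values, schema_text):
    """Assumed model: one copy of the schema for the whole column."""
    return len(schema_text.encode()) + sum(len(v.encode()) for v in values)


def instance_level_bytes(values, schema_text):
    """Assumed model: one copy of the schema per stored value."""
    return sum(len(schema_text.encode()) + len(v.encode()) for v in values)


# Under this model the saving is one schema copy per value beyond the first.
saving = instance_level_bytes(values, schema_text) - column_level_bytes(
    values, schema_text
)
```

Under these assumptions the saving grows linearly with the number of rows, so the column-level choice matters most for large tables whose values all share one schema.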
SQL Changes
- A DATASET type with CSV storage format is allowed everywhere a DATASET type with Avro storage format is allowed. For example: CREATE CSV SCHEMA MySchema.
- Otherwise, the SQL is unchanged.
Additional Information
For more information on the CSV storage format, see Teradata Vantage™ DATASET Data Type, B035-1198.