CSV Storage Format for the DATASET Data Type

Teradata Vantage™ NewSQL Engine Release Summary

Product
Teradata Database
Teradata Vantage NewSQL Engine
Release Number
16.20
Published
March 2019
Language
English (United States)
Last Update
2019-05-03
dita:id
B035-1098
Product Category
Software
Teradata Vantage

The DATASET data type, introduced in Teradata Database 16.0 for self-describing data, was designed to support multiple storage formats. The initial release supported the Avro storage format; this enhancement adds the comma-separated values (CSV) storage format. All features available with the existing DATASET type (for example, dot notation, column-level schemas, and functions) are supported with the new CSV storage format.

Benefits

The CSV storage format provides the following:

  • Variable length: Both the maximum length and the in-row length are variable.
  • Variable format of data stored: Support for CSV format is added, in addition to the existing Avro storage format.
  • Methods: Methods are provided to operate on the DATASET type in any storage format and with any schema.
  • Functions: Functions are provided to operate on the DATASET type in any storage format and with any schema.
  • Publishing to DATASET data type: Use data stored in relational tables to compose a DATASET type with any storage format and any schema.
  • Shredding: Shred a DATASET value in CSV storage format into a relational table.
  • Converting: Convert a DATASET value in CSV storage format into Avro or JSON.
  • Enhanced dot notation: The extended dot notation introduced in Teradata Database 16.0 is supported for the new CSV storage format.
  • Schema extensions: The CSV storage format supports some extensions to the CSV standard, such as user-specified column and record delimiters and header field names. If any of these extensions are used, a schema must be specified. Schemas can be defined at the column level or the instance level for any built-in storage format of the DATASET type. A column-level schema is binding for every instance loaded into that column, whereas instance-level schemas can vary from instance to instance.
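
To make the role of a schema concrete, the following Python sketch models, outside the database, how a schema that names a field delimiter, a record delimiter, and header field names lets a delimited string be shredded into rows and then converted to JSON. It is a conceptual illustration only: the schema key names (`field_delimiter`, `record_delimiter`, `field_names`), the `shred` helper, and the sample data are assumptions made for this example, not Teradata syntax or a Teradata API.

```python
import csv
import json

# Hypothetical schema: the key names below are assumptions for illustration,
# not the documented Teradata CSV schema syntax.
SCHEMA = {
    "field_delimiter": "|",
    "record_delimiter": ";",
    "field_names": ["item_id", "item_name"],
}


def shred(csv_text, schema):
    """Split a delimited string into one dict per record.

    Simplification: records are split naively on the record delimiter,
    so a delimiter inside a quoted field is not handled.
    """
    records = [r for r in csv_text.split(schema["record_delimiter"]) if r]
    reader = csv.reader(records, delimiter=schema["field_delimiter"])
    return [dict(zip(schema["field_names"], row)) for row in reader]


# "Shredding": each record becomes a row keyed by the schema's field names.
rows = shred("1|widget;2|gadget;", SCHEMA)

# "Converting": the same rows serialized as JSON.
as_json = json.dumps(rows)
```

Without the schema, the string `1|widget;2|gadget;` has no standard CSV interpretation, which mirrors why a schema is required whenever these extensions are used.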

Considerations

If CSV values require a schema, a table column that uses a column-level schema requires less disk space than one that uses instance-level schemas.
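
The space difference can be illustrated with a simplified model. The Python sketch below assumes that a column-level schema is stored once per column while an instance-level schema is stored alongside every value; that storage model, the schema contents, and the sample values are all assumptions for illustration, and the actual on-disk layout is not described here.

```python
import json

# Hypothetical schema and values, for illustration only.
schema = json.dumps({"field_delimiter": "|", "field_names": ["item_id", "item_name"]})
values = ["1|widget", "2|gadget", "3|gizmo"]


def column_level_bytes(schema, values):
    """Assumed model: schema stored once; each row holds only its data."""
    return len(schema.encode()) + sum(len(v.encode()) for v in values)


def instance_level_bytes(schema, values):
    """Assumed model: each row carries its own copy of the schema."""
    return sum(len(schema.encode()) + len(v.encode()) for v in values)


col_bytes = column_level_bytes(schema, values)
inst_bytes = instance_level_bytes(schema, values)

# Under this model the saving is one schema copy per row beyond the first.
saved = inst_bytes - col_bytes
```

Under this model the saving grows linearly with the number of rows, so the choice matters most for large tables whose values all share one schema.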

SQL Changes

  • A DATASET type with CSV storage format is allowed everywhere a DATASET type with Avro storage format is allowed. For example, a statement beginning CREATE CSV SCHEMA MySchema defines a CSV schema.
  • Existing SQL is otherwise unchanged.

Additional Information

For more information on CSV storage format, see Teradata Vantage™ DATASET Data Type, B035-1198.