DATASET Data Type

Database Design

Product: Advanced SQL Engine, Teradata Database
Release Number: 17.10
Published: July 2021
Language: English (United States)
Last Update: 2021-07-27
Document ID: B035-1094
Product Category: Teradata Vantage™

The DATASET data type includes a schema and data, both of which can be variable length. Use the INLINE LENGTH option to specify an inline storage size. Data that is smaller than or equal to the inline storage size is stored inside the base row. Larger data is stored as a large object (LOB).

Data stored inline is treated as a non-LOB type. Because there is no LOB overhead, performance may be better, especially when the data type is used with UDFs.
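The inline-versus-LOB rule described above can be sketched in a few lines of Python. This is only an illustrative model of the rule as stated in this topic, not how the database implements it; the function name `storage_location` and the byte counts in the usage loop are hypothetical.

```python
def storage_location(data_length: int, inline_length: int) -> str:
    """Return where a value of data_length bytes would be stored.

    Models the documented rule only: data no larger than the inline
    storage size stays in the base row; anything larger becomes a LOB.
    """
    if data_length < 0 or inline_length < 0:
        raise ValueError("lengths must be non-negative")
    return "inline" if data_length <= inline_length else "LOB"


if __name__ == "__main__":
    # Hypothetical sizes around an assumed inline length of 10,240 bytes.
    for size in (4_000, 10_240, 10_241):
        print(size, storage_location(size, inline_length=10_240))
```

Note that the comparison is inclusive: a value exactly equal to the inline storage size is still stored inline.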

Each specification of the DATASET data type includes the following information:

  • Maximum length
  • Inline length
  • Storage format
  • Character set (CSV storage format only)
  • Schema

Specify the STORAGE FORMAT option in the data type specification syntax. Available storage formats include Avro and CSV. The following values apply to either the schema or the data for the DATASET data type:

Storage Location    Maximum Length    Minimum Length    Default Length
LOB                 16 MB             100 bytes         16 MB
Inline              64 KB             100 bytes         10 KB
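As a rough illustration of how the limits in the table might be applied, the following Python sketch validates a requested length and falls back to the default when none is given. It assumes 1 KB = 1,024 bytes and 1 MB = 1,024 KB, which may not match the exact byte limits the database enforces; the names `LengthLimits`, `LIMITS`, and `resolve_length` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: binary units. The database's exact byte limits may differ.
KB = 1024
MB = 1024 * KB


@dataclass(frozen=True)
class LengthLimits:
    minimum: int
    maximum: int
    default: int


# Values taken from the table above, converted to bytes.
LIMITS = {
    "LOB": LengthLimits(minimum=100, maximum=16 * MB, default=16 * MB),
    "inline": LengthLimits(minimum=100, maximum=64 * KB, default=10 * KB),
}


def resolve_length(location: str, requested: Optional[int] = None) -> int:
    """Return the length to use for a storage location, in bytes.

    Uses the default when no length is requested, and rejects any
    requested length outside the documented minimum and maximum.
    """
    limits = LIMITS[location]
    if requested is None:
        return limits.default
    if not limits.minimum <= requested <= limits.maximum:
        raise ValueError(
            f"{location} length must be between {limits.minimum} "
            f"and {limits.maximum} bytes, got {requested}"
        )
    return requested
```

For example, `resolve_length("inline")` returns the default inline length, while `resolve_length("inline", 50)` raises `ValueError` because 50 bytes is below the 100-byte minimum.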

Optionally, specify a character set for the CSV format, either LATIN or UNICODE. The default is the session character set.

A schema is optional for the CSV format. You can specify a schema in any supported JSON format. For instance-level DATASET values, the schema is stored in the same character set as the CSV data type. For column-level DATASET values, it is encoded in UTF-8. The schema is null-terminated.

The CSV storage format supports extensions to the CSV standard, such as user-specified column and record delimiters and header field names. If you use any of these extensions, specify a schema. You can define schemas at the column level or the instance level for any of the built-in storage formats of the DATASET type. A column-level schema is binding for all instances of the data type loaded into that column, whereas instance-level schemas may vary from instance to instance.
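To make the role of a CSV schema more concrete, here is a minimal Python sketch in which a JSON schema carries the column delimiter, the record delimiter, and the header field names, and a parser applies them to the data. The JSON key names (`field_delimiter`, `record_delimiter`, `field_names`) and the function `parse_csv_with_schema` are assumptions made for illustration; they are not taken from this topic and should not be read as the schema keys the database expects. The parser is deliberately naive and does not handle quoted fields.

```python
import json
from typing import Dict, List


def parse_csv_with_schema(data: str, schema_json: str) -> List[Dict[str, str]]:
    """Split CSV text into records using delimiters named in a JSON schema.

    The schema keys used here are hypothetical. Quoting and escaping
    rules of real CSV are intentionally ignored in this sketch.
    """
    schema = json.loads(schema_json)
    field_delimiter = schema.get("field_delimiter", ",")
    record_delimiter = schema.get("record_delimiter", "\n")
    field_names = schema["field_names"]

    rows = []
    for record in data.split(record_delimiter):
        if not record:
            continue  # skip empty records, such as after a trailing delimiter
        values = record.split(field_delimiter)
        if len(values) != len(field_names):
            raise ValueError(f"expected {len(field_names)} fields: {record!r}")
        rows.append(dict(zip(field_names, values)))
    return rows


if __name__ == "__main__":
    schema = json.dumps({
        "field_delimiter": "|",
        "record_delimiter": ";",
        "field_names": ["id", "name"],
    })
    print(parse_csv_with_schema("1|Ann;2|Bob;", schema))
```

In this model, a column-level schema would correspond to passing the same `schema_json` for every value in a column, while instance-level schemas would correspond to passing a different `schema_json` with each value.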