Dataset Schemas - Teradata Studio

Teradata Studio User Guide

Product
Teradata Studio
Release Number
16.00
Published
March 2017
Language
English (United States)
Last Update
2018-03-29
dita:id
B035-2041
Product Category
Teradata Tools and Utilities

Teradata Database 16.00 supports a new data type called DATASET. A DATASET is a Complex Data Type (CDT) that represents self-describing data stored in a format that conforms to a schema. Every DATASET therefore has an associated schema, which can either be included with the column data or referenced. Schemas are created with a CREATE SCHEMA statement and stored in the SYSUDTLIB database. Schema information is stored in the DBC.DatasetSchemaInfo table.

Schemas are defined in a storage format. Currently, Apache Avro is the only storage format supported. Avro is a data serialization framework that uses JSON to define data types and protocols, and that serializes data in a compact binary format. It is used primarily in Apache Hadoop, where it provides a serialization format for persistent data and a wire format for communication between Hadoop nodes and between client programs and Hadoop services. Schemas can also include CSV-defined parameters.
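To make the relationship between a JSON-defined Avro schema and its compact binary encoding concrete, the following is a minimal Python sketch using only the standard library. It is not Teradata code and does not use an Avro library. The record name Reading, the field names sensor and value, and the helper functions are hypothetical and chosen only for illustration. The sketch handles just the Avro string and long types, and follows the encoding rules of the Avro specification: zigzag variable-length integers, length-prefixed UTF-8 strings, and record fields written in schema order.

```python
import json

# Hypothetical Avro record schema; record and field names are illustrative only.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Reading",
  "fields": [
    {"name": "sensor", "type": "string"},
    {"name": "value", "type": "long"}
  ]
}
"""


def encode_long(n: int) -> bytes:
    """Encode a 64-bit integer as an Avro long (zigzag, then base-128 varint)."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:  # more than 7 bits remain
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (Avro long) followed by the bytes."""
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw


def encode_record(schema: dict, datum: dict) -> bytes:
    """Concatenate the encoded fields in the order the schema declares them."""
    encoders = {"string": encode_string, "long": encode_long}
    out = bytearray()
    for field in schema["fields"]:
        out += encoders[field["type"]](datum[field["name"]])
    return bytes(out)


schema = json.loads(SCHEMA_JSON)
payload = encode_record(schema, {"sensor": "a1", "value": 42})
print(payload)  # b'\x04a1T'
```

The encoded record is four bytes long and carries no field names or type tags, which is why a reader needs the schema, whether included with the data or referenced, to interpret the value.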