Importing From an Avro Object Container File - Teradata Vantage NewSQL Engine - 16.20

Teradata Vantage™ DATASET Data Type

prodname: Teradata Database, Teradata Vantage NewSQL Engine
vrm_release: 16.20
category: Programming Reference
featnum: B035-1198-162K

The Avro specification provides an object container file format to transmit and store multiple binary-encoded Avro values with a common schema.

Because each file contains one Avro schema and one or more binary-encoded Avro values described by that schema, the data in an object container file maps to a DATASET STORAGE FORMAT AVRO column of a Teradata table with a column-level schema.
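
Because the schema travels in the file header, it can be read without a full Avro library. The following is a minimal Python sketch based on the header layout in the Avro specification: the 4-byte magic (the characters Obj followed by the byte 1), a metadata map, and a 16-byte sync marker. The function names read_long, read_bytes, read_header, and read_schema are illustrative only and are not part of any Teradata or Avro API.

```python
import json

MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded) from a binary stream."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of data inside a variable-length integer")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: this was the last byte
            return (result >> 1) ^ -(result & 1)  # undo zigzag encoding
        shift += 7


def read_bytes(stream):
    """Read a length-prefixed Avro bytes or string value."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data inside a bytes value")
    return data


def read_header(stream):
    """Return (metadata dict, sync marker) from the start of a container file."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a block byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    sync = stream.read(16)
    if len(sync) != 16:
        raise EOFError("file ends before the sync marker")
    return metadata, sync


def read_schema(path):
    """Return the parsed JSON schema stored under the 'avro.schema' metadata key."""
    with open(path, "rb") as f:
        metadata, _ = read_header(f)
    return json.loads(metadata["avro.schema"].decode("utf-8"))
```

For example, read_schema("orders.avro") (a hypothetical file name) would return the schema as a parsed JSON value; json.dumps converts it back to the JSON text form.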

Teradata Database supports these files directly through the AvroContainerSplit table operator. The following steps describe a general framework for importing Avro data from them.

  1. Retrieve the schema from the object container file.
  2. Create a schema on the Teradata system with the CREATE <storage-format-name> SCHEMA DDL statement, supplying the schema retrieved in Step 1. The schema may be specified in LATIN or UNICODE characters, or as its UTF-8 byte representation.
  3. Create a table on the Teradata system that has the desired structure and includes a DATASET STORAGE FORMAT AVRO column whose column-level schema is the schema created in Step 2.
  4. Run the AvroContainerSplit table operator to load the Avro DATASET values into the table created in Step 3.

These steps allow any application to import data from an object container file into a Teradata table.

If the DATASET table column is defined without a column-level schema, the schema is stored with each Avro instance in the table.
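
Before or after running Step 4, it can be useful to know how many Avro values the container file holds, for example as a number to compare against what was loaded. The following Python sketch walks the data blocks as laid out in the Avro specification (an object count, a byte size, the serialized objects, then the 16-byte sync marker) and sums the counts without decoding any values. The names read_long, skip_header, and count_values are illustrative assumptions, not Teradata or Avro APIs.

```python
def read_long(stream, eof_ok=False):
    """Decode one Avro long; return None at a clean end of file if eof_ok is set."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if eof_ok and shift == 0:
                return None
            raise EOFError("unexpected end of data inside a variable-length integer")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: last byte of this integer
            return (result >> 1) ^ -(result & 1)  # undo zigzag encoding
        shift += 7


def skip_header(stream):
    """Skip the magic bytes and metadata map; return the 16-byte sync marker."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    while True:
        count = read_long(stream)
        if count == 0:  # end of the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            stream.read(read_long(stream))  # metadata key, discarded
            stream.read(read_long(stream))  # metadata value, discarded
    sync = stream.read(16)
    if len(sync) != 16:
        raise EOFError("file ends before the sync marker")
    return sync


def count_values(stream):
    """Return the total number of Avro values across all data blocks."""
    sync = skip_header(stream)
    total = 0
    while True:
        count = read_long(stream, eof_ok=True)
        if count is None:  # no more data blocks
            return total
        size = read_long(stream)
        if len(stream.read(size)) != size:
            raise EOFError("data block is truncated")
        if stream.read(16) != sync:
            raise ValueError("sync marker mismatch; the file may be corrupt")
        total += count
```

The function takes a binary stream, so a file would be opened in "rb" mode and passed in directly. Because only the block headers are read, the count works regardless of the codec used to compress the block contents.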