Importing From an Avro Object Container File

DATASET Data Type

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-23
dita:id
B035-1198
lifecycle
previous
Product Category
Teradata Vantage™

The Avro specification provides an object container file format to transmit and store multiple binary-encoded Avro values with a common schema.

Because these files contain one Avro schema and one or more binary-encoded Avro values described by that schema, the data in an object container file maps to a DATASET STORAGE FORMAT AVRO column of a Teradata table with a column-based schema.

The database supports these files directly through the AvroContainerSplit table operator. The following steps describe a general framework for importing Avro data from an object container file.

  1. Retrieve the schema from the object container file.
  2. Create a schema on a Teradata system with the CREATE <storage-format-name> SCHEMA DDL statement, supplying the schema retrieved in Step 1. The schema text may be specified in LATIN or UNICODE characters, or as UTF-8 in its byte representation.
  3. Create a table on a Teradata system that has the desired structure and includes a DATASET STORAGE FORMAT AVRO column whose column-level schema is the schema created in Step 2.
  4. Run the AvroContainerSplit table operator to load the Avro DATASET values into the table created in Step 3.
These steps allow any application to import data from an object container file to a database table.
If the DATASET table column is defined without a column-based schema, the schema is stored with each Avro instance in the table.
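
Step 1 can be performed outside the database by any program that reads the container file header. The following is a minimal sketch in Python, not part of the Teradata product. It assumes the header layout given in the Apache Avro specification: the four-byte magic "Obj" followed by the byte 1, then a metadata map whose "avro.schema" entry holds the schema as JSON text, then a 16-byte sync marker. The function names are hypothetical and were chosen for this illustration.

```python
MAGIC = b"Obj\x01"  # four-byte magic that opens an Avro object container file


def read_long(stream):
    """Decode one Avro long (variable-length, zigzag-encoded)."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Decode Avro bytes or string: a long length, then that many bytes."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_container_schema(stream):
    """Return the schema JSON text stored in a container file header."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count: a block byte size follows
            count = -count
            read_long(stream)  # block size is not needed here
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    if "avro.schema" not in metadata:
        raise ValueError("header has no avro.schema entry")
    return metadata["avro.schema"].decode("utf-8")


# Usage, with a hypothetical file name:
#     with open("example.avro", "rb") as f:
#         print(read_container_schema(f))
```

The returned text is the schema to supply in Step 2. The sketch reads only the header, so it works regardless of the codec used to compress the data blocks.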