About Copying Datasets and Schemas - Teradata Data Mover

Teradata Data Mover User Guide

Product
Teradata Data Mover
Release Number
16.00
Published
December 2016
Language
English (United States)
Last Update
2018-03-29
dita:mapPath
rdo1467305237457.ditamap
dita:ditavalPath
ft:empty
dita:id
B035-4101
lifecycle
previous
Product Category
Analytical Ecosystem
Data Mover supports copying objects associated with the Teradata Database 16.00 DATASET type. The Teradata DATASET type is a complex data type (CDT) representing self-describing files that are interpreted based on a schema. A schema could be created with the following statements:
CREATE AVRO SCHEMA SYSUDTLIB.testSchema AS '{
      "namespace": "example.avro",
      "type": "record",
      "name": "User",
      "fields": [
            {"name": "name", "type": "string"},
            {"name": "favorite_number",  "type": ["int", "null"]},
            {"name": "favorite_color", "type": ["string", "null"]}
       ]

      }';
After creation, the schema definition can be retrieved by executing a show schema-type query on the schema.
SHOW AVRO SCHEMA testSchema

Copying Schemas Using Data Mover

Data Mover copies schemas as follows:
  • Data Mover retrieves the schema creation DDL by executing a show schema-type sysdtlib.schemaname SQL statement.
  • Data Mover does not parse a create schema DDL; the exact same DDL is executed on the target system. Data Mover does not relocate or rename schema objects.
  • If copying a table with a DATASET column that references a schema, the schema must already exist on the target system or be copied with the job. Schemas are created in the SYSUDTLIB database.
  • If you copy a schema with the same name that exists on the target system and the overwrite_existing_objects tag is set to true, the schema is overwritten if no columns on the target system reference the schema.

Copying Datasets Using Data Mover

Query the column values to determine if it provides the necessary information to move the column as a DATASET column:
  • The maximum length and DT type code are included in dataset type columns.
  • The storage format column contains the following value when the column is a DATASET data type:
    "Avro": DATASET stored as Avro
  • A column entitled datasetSchemaName containing the name of the schema is associated with a particular column. If no schema is associated with a particular column, this field has a NULL value.
A dataset could be copied using the following command:
CREATE TABLE hcExampleTable(
      id INTEGER,
      avroFile DATASET(100) STORAGE FORMAT Avro WITH SCHEMA chemDatasetSchema
 
);