Defining a Schema - Parallel Transporter

Teradata® Parallel Transporter User Guide

Product: Parallel Transporter
Release Number: 16.20
Published: August 2020
Language: English (United States)
Last Update: 2020-08-27
Document ID: B035-2445
Product Category: Teradata Tools and Utilities

Teradata PT requires that the job script describe the structure of the data to be processed, that is, the columns in table rows or the fields in file records. This description is called the schema. Schemas are created using the DEFINE SCHEMA statement.

The value following the keyword SCHEMA in a DEFINE OPERATOR statement identifies the schema that the operator will use to process job data. Schemas specified in operator definitions must have been previously defined in the job script. To determine how many schemas you must define, observe the following guidelines on how and why schemas are referenced in operator definitions (except standalone operators):

The schema referenced in a producer operator definition describes the structure of the source data.
The schema referenced in a consumer operator definition describes the structure of the data that will be loaded into the target. The consumer operator schema can be coded as SCHEMA * (a deferred schema), which means that it will accept the schema of the output data from the producer.
You can use the same schema for multiple operators.
You cannot use multiple schemas within a single operator, except in filter operators, which use two schemas (input and output).
The column names in a schema definition in a Teradata PT script do not have to match the actual column names of the target table, but their data types must match exactly.
When a Teradata PT job is processing character data in the UTF-16 character set, all CHAR(byte count), VARCHAR(byte count), CLOB(byte count), and JSON(byte count) schema columns will have byte count values that are twice the character count values in the corresponding column definitions of the Teradata Database table; the byte count must be an even number.
When a Teradata PT job is processing character data in the UTF-8 character set, all CHAR(byte count), VARCHAR(byte count), CLOB(byte count), and JSON(byte count) schema columns will have byte count values that are three times the character count values in the corresponding column definitions of the Teradata Database table.
Teradata PT can automatically adjust byte count values for the UTF8 and UTF16 encoding types. If the USING CHARACTER SET UTF8 or UTF16 specifier is present at the top of the script, use the keywords ADJUST UNICODE on the line immediately below DEFINE SCHEMA, and Teradata PT will handle the doubling or tripling of the byte count. See the Teradata Parallel Transporter Reference (B035-2436) for information about the ADJUST UNICODE keywords, and the Teradata Parallel Transporter Reference Guide for additional information about the USING CHARACTER SET keywords.

When using the UTF-16 character set in a job script, the value of n in VARCHAR(n) and CHAR(n) in the SCHEMA definition must be a positive even number.
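The byte count rules above amount to simple arithmetic, which the following Python sketch illustrates. It is not part of Teradata PT; the names BYTES_PER_CHAR, schema_byte_count, and is_valid_utf16_length are hypothetical helpers introduced only for this illustration, and the multipliers are the ones stated in the guidelines (two for UTF-16, three for UTF-8).

```python
# Multipliers taken from the guidelines above:
# UTF-16 doubles the character count, UTF-8 triples it.
BYTES_PER_CHAR = {"UTF16": 2, "UTF8": 3}


def schema_byte_count(char_count: int, charset: str) -> int:
    """Return the byte count to code in the schema for a character column.

    char_count is the character count in the table column definition;
    charset is "UTF8" or "UTF16" (a hyphenated spelling is also accepted).
    """
    if char_count <= 0:
        raise ValueError("character count must be positive")
    key = charset.upper().replace("-", "")
    if key not in BYTES_PER_CHAR:
        raise ValueError(f"no multiplier known for character set {charset!r}")
    return char_count * BYTES_PER_CHAR[key]


def is_valid_utf16_length(n: int) -> bool:
    """Check the UTF-16 rule: n in CHAR(n) or VARCHAR(n) is positive and even."""
    return n > 0 and n % 2 == 0


# A table column defined as VARCHAR(10) would be coded as VARCHAR(20)
# in the schema for a UTF-16 job, and as VARCHAR(30) for a UTF-8 job.
utf16_bytes = schema_byte_count(10, "UTF16")
utf8_bytes = schema_byte_count(10, "UTF8")
```

When ADJUST UNICODE is used, Teradata PT performs this adjustment itself, so a calculation like this applies only when byte counts are coded by hand.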

The following is an example of a schema definition:

Example Schema Definition