Schemas - Parallel Transporter

Teradata® Parallel Transporter Application Programming Interface Programmer Guide

Product
Parallel Transporter
Release Number
16.20
Published
November 2020
Language
English (United States)
Last Update
2020-11-19
dita:mapPath
ghk1527114222323.ditamap
dita:ditavalPath
Audience_PDF_include.ditaval
dita:id
B035-2516
lifecycle
previous
Product Category
Teradata Tools and Utilities

A schema object defines the format of the data being used in the load and export tasks. A schema consists of one or more columns, and the format of each column’s data. The schema lets the driver know what data to expect.

Unlike an attribute, a schema is defined as an external C++ class. When creating a schema object using the class constructor, the schema must be designated as an input or output schema. This means the task will involve either loading, updating data, or exporting data. The following is an example of creating a schema object:

Schema * schema = new Schema("input");

Columns are added to the schema object one at a time using the TD_DataType constants to designate the data type of the column. The following are examples of adding a column to a schema object:

schema->AddColumn("Associate_Name",TD_CHAR,25);
schema->AddColumn("Salary",TD_FLOAT,8);

The following figure depicts the coding lines for a two-table select statement and how the data is brought together in the answer set.

Schema Definition for a Multiple Table Select Statement

When all columns have been added to the schema object, the schema object is then passed to the Connection class. The following is an example of adding a schema object to a Connection object:

conn->AddSchema(schema);

Working with Schemas

When working with schemas, note the following:

  • Every driver requires a schema except when using either the Update driver's Delete Task with no variable substitution or the Export driver’s dynamic schema feature.
  • A Connection object may have only one schema.
  • The column names defined in the schema do not have to match the actual names of the columns in the target table(s) as they can be renamed in the SQL statements used for the job. The column names in the schema must match the names used in the job’s SQL statements.
  • The column data types defined in the schema do not have to match the actual data types of the columns in the target table(s) since they can be cast into other data types in the SQL statements used. The column data type in the schema must, however, match the data types used in the job’s SQL statements.
  • The order of the columns defined in the schema does not have to match the actual order of the columns in the target table(s) as the order can be changed in the SQL statements used. The order of the columns in the schema must, however, match the order of the columns in the job’s SQL statements.
  • When loading multiple tables using the Update or Stream driver, the input schema defined for the job is a superset of the columns in all of the tables being loaded. One or more of the columns in the schema must correlate to the columns being loaded in each of the target tables.
  • VARCHAR and VARBYTE data with a length less than or equal to the defined length in the SCHEMA object is allowed.

Examples of Schema Definitions

If the Export driver is being used to export data from Table 1 which has columns A, B, and C, and the SELECT statement is:

SELECT A FROM Table 1

then the output schema is defined as one column with a data type matching column A’s definition in Table 1. If column A is defined as a variable character with a length of 50, then the column in the output schema is defined as TD_VARCHAR with a length of 50.

Schema* schema = new Schema("output");
schema->AddColumn("A", TD_VARCHAR, 50);

Here’s another example using the same table. If the select statement is changed to:

SELECT A(CHAR(100)) FROM Table 1

then the column in the output schema is defined as TD_CHAR with a length of 100.

Schema* schema = new Schema("output");
schema->AddColumn("A", TD_CHAR, 100);