Script Writing Guidelines
The following script writing guidelines will help write a Teradata TPump job script:
A script may contain up to 100 IMPORTs (tasks), delimited by a leading BEGIN LOAD
command and a trailing END LOAD command.
The BEGIN LOAD command specifies the number of sessions and establishes a number of
controlling parameters.
The BEGIN LOAD command also specifies the error table, and is the only table specified.
An optional qualifying database name may also be specified. This database name may
be different from the database being modified, thus allowing tables to be created
and dropped with no impact on the production database.
In addition, the BEGIN LOAD command establishes acceptable threshold levels for important
task controls, such as number and percentage of errors, session limits, duration of
logon attempts in hours (tenacity), and checkpointing frequency. This command also
provides optional controls to:
determine where any macros are placed
guarantee serial operations on given rows
select the number of statements to pack into a multipleāstatement request
select a restart logic mode
The next item appearing in a script is usually a description of the records in the
external file containing the change data for the target tables. The description of
these input records appears in a sequence of commands headed by the LAYOUT command.
The LAYOUT command tags the record layout being depicted with a unique name, which
is then referenced by subsequent script commands in tasks throughout the rest of the
job. The LAYOUT is followed by the supporting information contained in the sequence
of one or more FIELD, FILLER, and TABLE commands.
Each FIELD command describes a single data item occupying a column in the input row.
These items are described by data type, starting position, length, and several other
characteristics. The FIELD command is used only for those items (columns) relevant
to the current task, which are to be sent to Teradata Database as changes to the target
table.
The FIELD command may include the KEY modifier if the column is to be considered part
of the primary index for purposes of serialization.
Each FILLER command describes a column in the input row in the same way as the FIELD
command. These FILLER fields are never sent to Teradata Database. The FILLER command,
however, identifies those columns which should not be sent to Teradata Database. Thus,
if a sequence of 10 alternating FIELD and FILLER commands is used to describe 10 contiguous
columns in the row, every other column, a total of five columns, would be sent to
Teradata Database.
The TABLE command identifies any existing table with the same layout as the input.
The TABLE command is used when the changes are being enacted on entire rows, rather
than selected columns.
The next entry in the script is the DML command, which is followed by the DML statements
INSERT, UPDATE, and DELETE. The DML command creates an identifying label for the DML
statement input, which immediately follows the command. The DML command also defines
an error handling process for handling missing and duplicate rows, with respect to
the error table.
The three DML statements (INSERT, UPDATE, and DELETE) follow the DML command, and
may occur in any order and in any quantity. The INSERT statement is used to place
a complete and entirely new row into the target table.
The UPDATE statement takes the data contents from columns in the input record, as
defined with the LAYOUT, FIELD, FILLER command sequence, and substitutes the data
into the target table. The UPDATE rows are selected based on criteria specified in
a conditional clause in the statement.
The DML command also allows UPDATE and INSERT statements to be paired to provide Teradata
TPump with an upsert capability. This allows Teradata TPump, in a single pass, to
attempt an UPDATE and, if it fails, perform an INSERT on the same row.
The DELETE statement removes entire rows from the target table whenever the evaluation
of the deleting condition is true, as specified in a conditional clause in the statement.
The only information not yet provided in the task is the identity of the input data
file, the starting and ending records in the file that are being used in this task,
and other related information. This is done with the IMPORT command. This command
basically tells the Teradata TPump utility to bring in file X, from record A through
record N, to associate the layout name (and specifications) with the input records,
and to apply the desired DML (INSERT, UPDATE, and DELETE) statement to each record.
The last command in the script is the END LOAD command. This command flags the end
of the commands and statements for the task, and triggers the program to begin execution
of the task.
For compatibility with the MultiLoad utility, multiple IMPORTs (up to 100) are allowed
within a single BEGIN/END LOAD pair. However, because Teradata TPump does have an
apply phase, there is no significant difference between a script containing 100 BEGIN/END
LOAD pairs, each with one IMPORT, and a script with one BEGIN/END LOAD pair and 100
IMPORTs.