Syntax - Parallel Data Pump

Teradata® Parallel Data Pump Reference

Product
Parallel Data Pump
Release Number
17.00
Published
June 2020
Language
English (United States)
Last Update
2020-06-19
dita:mapPath
ioq1544831946920.ditamap
dita:ditavalPath
obe1474387269547.ditaval
dita:id
B035-3021
lifecycle
previous
Product Category
Teradata Tools and Utilities


where the following is true:

SESSIONS
Keyword for the number of Teradata TPump sessions.
number
Number of sessions to be logged on for update purposes for Teradata TPump.
A Teradata TPump task logs on and uses the number of sessions specified. One additional session is used for performing various utility functions.
There is no default value for number; it must be specified. Neither is there a maximum value, except for system-wide session limitations, which vary among machines.
Limiting the number of sessions conserves resources on both the external system and the database. This conservation is at the expense of a potential decrease in throughput and increase in elapsed time.
threshold
Minimum number of sessions to be logged on for update purposes for the utility.
When logging on sessions, if the limits are reached above the threshold value, Teradata TPump stops trying to log on, and uses whatever sessions are already logged on.
If the sessions run out before the threshold is reached, Teradata TPump logs off all sessions, waits for the time determined by the SLEEP value, and tries to log on again.
ERRORTABLE
Optional keyword for identifying a database and error table.
Use a database name as a qualifying prefix to the error tables. Specifying a database that is not the production database avoids cluttering the production system with error tables. This means that because the database should have been allocated a lot of PERM space, that space will not have to be increased for all databases with tables involved in the Teradata TPump task.
It is the user’s responsibility to ensure that the error table sharing was meaningful. Two jobs targeting different tables with different input record layouts cannot share the same error table.
APPEND
Specifies the existing error table.
If the specified error table does not exist, Teradata TPump will create it. If the structure of the existing error table is not compatible with the error table Teradata TPump creates, the job will run into an error when Teradata TPump tries to insert or update the error table.
NODROP
Specifies to retain (not DROP) the error table, even it is empty at the end of a job.
NODROP can be used with APPEND to persist the error table or alone.
QUEUETABLE
Selects the error table as a Queue Table.
dbname.
The qualified database for the error table.
If the database is not specified, the database which contains the log table is used. The period following dbname separates the database name from the tname parameter. If a different database is specified, it may help to avoid cluttering the production database with error tables.
tname
Error table that receives information about errors detected during the load.
tname may be preceded by a database name qualifier. This table is referred to as the error table or ET table.
TPump explicitly specifies FALLBACK as the level of protection.
The default error table name is composed of the job name, followed by an underscore and sequence number of the load, then an underscore and an ET, as in jobname_nnn_ET.
tname identifies a nonexisting table for a nonrestart task, or an existing table for a restart task.
For all errors inserted in this error table, the identifiers for the offending combination of statement and data record are included in the appropriate row of tname. The columns in the error table allow the identification of a specific data record and statement combination which produced an error. The column names and definitions of the error table are:
  • ImportSeq – A byteint containing the IMPORT command sequence number.
  • DMLSeq – A byteint containing the sequence number of the DML command within the command file.
  • SMTSeq – A byteint containing the sequence number of the DML statement within the DML command.
  • ApplySeq – A byteint containing the sequence number of the APPLY clause within its IMPORT command.
  • SourceSeq – An integer containing the position of a data record within a data source.
  • DataSeq – A byteint identifying the data source. This value is always one.
  • ErrorCode – An integer containing an error return code.
  • ErrorMsg – Contains the corresponding error message for the error code.
  • ErrorField – A smallint, which, if valid, indicates the bad field.

    The names of record fields sent to the database are specified via the LAYOUT command, in conjunction with FIELD and TABLE commands.

  • HostData - A variable length byte string containing the data sent by the external system.
  • RowInsetTime – Indicates when the row was inserted into the Teradata TPump error table.
ERRLIMIT
Optional keyword for setting a limit on records rejected for errors.
When the ERRLIMIT is exceeded, Teradata TPump performs a checkpoint, then terminates the job. The data read before ERRLIMIT was exceeded will be submitted and completed before the job is terminated. This means when a job is terminated due to ERRLIMIT was exceeded, there may be more error records in the error table than the number specified in ERRLIMIT. To facilitate diagnosis of data errors, the ERRLIMIT should be greater than the number of statements packed into one request.
errcount
Error threshold for controlling the number of rejected records. Usage depends on whether used with the errpercent parameter.
  • When used without the errpercent parameter, specifies, as an unsigned integer, the number of records that can be rejected and recorded in tname during a load (all records sent between the BEGIN LOAD and END LOAD commands). The default is no limit.
  • When used with the errpercent parameter (which is approximate), specifies the maximum number of records that must be sent to the database before the errpercent parameter is applied.
For example, if errcount = 100 and errpercent = 5, then 100 records must be sent to the database before the approximate 5 percent rejection limit is applied. If only the first five records are rejected when the 100th record is sent, the limit is not exceeded. If there are six rejections, however, then the limit is exceeded. After the 100th record is sent, Teradata TPump stops processing if the 5 percent limit has been exceeded.
When the limit has been exceeded, Teradata TPump writes an error message to the external system’s customary message destination and terminates the task.
All tables in use are left in their state at the time of the termination. This allows errors to be corrected in data records and restarting of the task from the last checkpoint. If a restart is not possible or not desired, any unwanted tables should be dropped.
CHECKPOINT
Keyword indicating the number of minutes between the occurrences of checkpoints.
This is followed by a frequency value.
frequency
The interval in minutes between check pointing operations. Specify an unsigned integer from 0 through 60, inclusive.
To specify a CHECKPOINT frequency of less than or equal to 60, a checkpoint is recorded at the specified frequency, in minutes.
To specify a CHECKPOINT frequency of more than 60, Teradata TPump terminates the job.
Specifying a CHECKPOINT frequency of zero bypasses all checkpoint functions. Teradata TPump does not do checkpoint operations at all, which is different from Teradata Tools and Utilities 07.00 and earlier. The Teradata TPump job will not be restart-able once data loading has begun.
If a CHECKPOINT frequency is not specified, check pointing occurs every 15 minutes by default.
Whether specified or not, checkpoints are written at the end of each data input source.
Checkpoints should not be set for an FDL-compatible INMOD routine with the FOR, FROM, or THRU options. When an FDL-compatible INMOD routine is used with the FOR, FROM, or THRU options, Teradata TPump terminates and an error message appears if the checkpoint frequency is other than zero.
DATA ENCRYPTION
ON/OFF
Keyword to encrypt import data and the request text during the communication between Teradata TPump and the database.
If ON, the encryption will be performed. If DATAENCRYPTION is not specified, the default is OFF.
The "-y" runtime parameter applies the encryption to all connected sessions, which include the control session and the load sessions. This option only applies the encryption to the load sessions, which are the sessions specified by the SESSIONS keyword in the BEGIN LOAD command, and overrides the "-y" runtime parameter when OFF is explicitly specified. For example, assuming the PARTITION command is not used in the job, when "-y" runtime parameter is specified with the job and DATAENCRYPTION OFF is specified in the script, the encryption will only apply to the control session. Similarly, assuming the PARTITION command is not used in the job when "-y" runtime parameter is not specified with the job, and DATAENCRYPTION ON is specified in the script, the encryption will apply to all load sessions but not the control session.
When the PARTITION command is used, the encryption setting explicitly specified in the PARTITION command will override the setting of this option over the sessions defined by the PARTITION command.
ARRAY SUPPORT
ON/OFF
“ArraySupport ON|OFF” option to the .BEGIN LOAD command and the .DML command.
When “ArraySupport ON” is specified in the .BEGIN LOAD command, the .DML commands enclosed in .BEGIN LOAD and .END LOAD command pair will use the ArraySupport feature for its DML statement, unless “ArraySupport OFF” is specified for the .DML command. The default value of ArraySupport for the .BEGIN LOAD command is OFF.
When “ArraySupport ON|OFF” is not specified with the .DML command, the default value for ArraySupport for that .DML command is the effective setting of ArraySupport in the .BEGIN LOAD command where the .DML command resides. When “ArraySupport ON|OFF” is specified at the .DML command, the specified value overrides the default setting determined by the .BEGIN LOAD command.
When a .DML command is using the ArraySupport feature (defined in the .BEGIN LOAD command or in the .DML command), it must contain one and only one DML statement, and the session partition that the .DML command references needs to be used by this .DML command exclusively.
If the DML statement is an UPSERT-type statement, it can be specified as a pair of INSERT/UPDATE statements with DO INSERT FOR MISSING UPDATE clause. Teradata TPump will create its equivalent form of UPDATE … ELSE INSERT …, example Atomic Upsert, and use it as the actual DML statement. Or an UPDATE … ELSE INSERT … statement can be directly specified with DO INSERT FOR MISSING UPDATE clause.
The non-atomic form of UPSERT is not supported by Teradata TPump Array Support.
TENACITY
Keyword (with hours parameter) defining how long the utility tries to log on the sessions needed to perform the Teradata TPump job.
If a logon is denied, Teradata TPump delays for the time specified by the SLEEP parameter (the default is six minutes) and retries the logon. It retries until either the logon succeeds or the number of hours specified by TENACITY is exceeded.
If the TENACITY parameter is not specified, the utility retries the logons for four hours.
hours
Teradata TPump tenacity factor, as an integral number of hours. Specifies how long Teradata TPump keeps trying to logon to the required sessions.
The default value for hours is 4 if the parameter is not specified. If hours is specified as 0, Teradata TPump does not retry logons after a logon fails because of a capacity limit.
When a “no more sessions” error appears (either a 301 return code from a workstation CLI or a 513 return code from a mainframe CLI), Teradata TPump drops the sessions already acquired, and terminates the job without trying another logon.
LATENCY
Keyword for flushing stale buffers.
When using the Teradata TPump latency option with Teradata Access Module for Named Pipes, need_full_block = no option should be added in the Named Pipes Access Module initialization string.
seconds
Flushing threshold based on number of seconds oldest record has resided in buffer.
LATENCY cannot be less than one second.
If the SERIALIZE parameter is set to OFF, only the current buffer can possibly be stale. If SERIALIZE is ON, the number of stale buffers can range from zero to the number of sessions.
NOTIMERPROCESS
Keyword to tell Teradata TPump not to fork a child process as a timer process.
When a child process is forked, the SIGUSR2 signal notifies the parent process when the latency period expires. When a child process is not forked, the SIGALRM signal notifies the Teradata TPump process when the latency period expires. A child process is necessary for the latency function to work properly on the UNIX platforms when the MQSeries Access Module is used.
minutes
Number of minutes to wait between unsuccessful logon attempts due to session limits errors on the database or CLIv2.
If SLEEP is not specified, the default between unsuccessful logon attempts is 6 minutes.
SERIALIZE ON/OFF
Keyword to set the state (ON/OFF) of the serialization feature which, if ON, guarantees that operations on a given key combination (row) occur serially.
If SERIALIZE is not specified, the default is OFF.
This feature is meaningful only when a primary key for the loaded data is specified by using the KEY option of the FIELD command.
Serialization must be set to OFF if the .TABLE command is used and the target table is a NoPI table.
To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in a Teradata TPump job.
PACKMAXIMUM
Keyword requesting Teradata TPump to dynamically determine the maximum possible PACK factor for the current load.
Maximum value is 1500.
Displayed in message UTY6652, the value thus determined should be specifically used on subsequent runs, as the use of PACKMAXIMUM requires iterative interactions with the database during initialization to heuristically determine the maximum possible PACK factor.
Since the maximum packing factor has changed from 600 to 1500, it will take a longer time for Teradata TPump to test the packing factor if PACKMAXIMUM is specified.
PACK
Keyword for the number of statements to pack into a multiple-statement request.
Maximum value is 1500.
Packing improves network/channel efficiency by reducing the number of sends and receives between the application and the database.
statements
Number of statements, as a positive integer of up to 1500, to pack into a multiple-statement request.
Default value is 20 statements per request.
Under certain conditions, Teradata TPump may determine that the pack factor has been set too high. Teradata TPump then automatically lowers the pack setting to an appropriate value and issues warning message UTY6625, for instance:
“UTY6625 WARNING: Packing has been changed to 12 statements per request” and continues.
Packing improves network/channel efficiency by reducing the number of sends/receives between the application and the database.
The packing factor is validated by sending a fully packed request to the database using a prepare. This test checks for syntax problems and requests that are excessively large and may overwhelm the parser.
To simplify the script development process, Teradata TPump ignores certain errors returned by an overloaded parser, shrinks the request, retries the prepare until it executes successfully and finally, issues a warning noting the revised packing factor size.
When this happens, the Teradata TPump script should be modified to eliminate the warning, thereby avoiding the time-consuming process of shrinking the request.
A packing failure may occur if the source parcel length does not match the data defined. If this happens, Teradata TPump issues the message:
“UTY2819 WARNING: Packing may fail because input data does not match with the data defined.”
To resolve this problem, increase the packing factor and resubmit the job.
RATE
Keyword for entering the rate at which statements are sent to the database.
RETRYTIMES nn
Keyword for retry times number of retry times.
Default is 16.
If nn equals 0, the retry times will be set to 16. If retrytimes is set, this only takes effect for the requests/data between "BEGIN LOAD" and "END LOAD" pair.
statement_rate
Initial maximum rate at which statements are sent to the database per minute.
The statement rate must be a positive integer. If the statement rate is unspecified, the rate is unlimited.
If the statement_rate is less than the statement packing factor, Teradata TPump sends requests smaller than the packing factor.
If the Teradata TPump Monitor is in use, the statement_rate can be changed later on.
SLEEP
Keyword for the number of minutes to sleep.
NOATOMICUPSERT
Keyword to perform non-atomic upsert operations for UPSERT DMLs in the job script if these UPSERT DMLs are not provided in the Atomic UPSERT form.
NOMONITOR
Keyword to prevent Teradata TPump from checking for statement rate changes from, or update status information for, the Teradata TPump Monitor.
ROBUST ON/OFF
The ON parameter ensures data integrity when target tables have identity columns.
The OFF parameter signals Teradata TPump to use simple restart logic. In this case, restarts cause Teradata TPump to begin where the last checkpoint occurred in a job. Any processing that occurred after the checkpoint is redone. This method does not have the extra overhead of the additional database writes in the Robust logic.
Certain errors may cause reprocessing, resulting in extra error table rows due to re-executing statements (attempting to re-insert rows, for example). Or, if the target table allows duplicate rows, re-executing statements may cause extra duplicate rows to be inserted into the target table instead of causing extra error table rows.
Simple logic is adequate in certain DML statements that can be repeated without changing the results of the operation. Examples of statements that are not simple logic include the following:
  • INSERTs into tables that allow duplicate rows (MULTISET tables).
  • Self-referencing DML statements such as:
    • “UPDATE FOO SET A=A+1...”
    • “UPDATE FOO SET A = 3 WHERE A=4”
MACRODB
Keyword for database to contain any macros used by Teradata TPump.
dbname
Name of database which is to contain any macros built/used by Teradata TPump.
This database overrides the default placement of macros into the database which contains the log restart table.
NOTIFY
Teradata TPump implementation of the notify user exit option:
  • NOTIFY OFF suppresses the notify user exit option.
  • NOTIFY LOW enables the notify user exit option for those events signified by “Yes” in the Low Notification Level column of the following table.
  • NOTIFY MEDIUM enables the notify user exit option for the most significant events, as specified by “Yes” in the Medium Notification Level column of the following table.
  • NOTIFY HIGH enables the notify user exit option for every Teradata TPump event that involves an operational decision point, as specified by “Yes” in the High Notification Level column of the following table.
  • NOTIFY ULTRA enables the notify user exit option for every Teradata TPump event that involves an operational decision point, as specified by “Yes” in the ULTRA Notification Level column of the following table.
  • Beginning with Teradata Tools and Utilities 14.00, TPump terminates with a fatal error if any of NOTIFY error conditions occur.
EXIT name
Keyword phrase that calls a user-defined exit where name is the name of a user-supplied library with a member name of _dynamn.
The exit must be written in C, or in a language with a runtime environment that is compatible with C.
On some versions of UNIX systems, ./prefix characters may have to be added to the EXIT name specification if the module is in the current directory.
TEXT 'string'
User-supplied string of up to 80 characters that Teradata TPump passes to the named exit routine.
The string specification must be enclosed in single quote characters (').
MSG 'string'
User-supplied string of up to 16 characters that Teradata TPump logs to:
  • The operator’s console for mainframe-attached z/OS client systems
  • The system log for workstation-attached client systems for UNIX and Windows systems
The string specification must be enclosed in single quote characters (').
EXIT64 name
Keyword phrase that calls a user-defined exit where name is the name of a user-supplied library with a member name of _dynamn.
When EXIT64 keyword is presented, the related row counters use 8-byte unsigned integer instead of 4-byte unsigned integer.
EXITON name
Keyword phrase that calls a user-defined exit where name is the name of a user-supplied library with a member name of _dynamn.
The exit must be written in C, or in a language with a runtime environment that is compatible with C. On some versions of UNIX systems, ./prefix characters may have to be added to the EXITEON name specification if the module is in the current directory.
The keyword EXITEON automatically supports the keyword EXIT64.

The following table lists events that create notifications.

Events that Create Notifications
Notification Level
Event Low Medium High Ultra Signifies
Checkpoint Begin No No Yes Yes Teradata TPump started a checkpoint.
Checkpoint End No No Yes Yes Teradata TPump successfully completed a checkpoint.
CLIv2 Error Yes Yes Yes Yes Teradata TPump received a CLIv2 error.
Database Error Yes Yes Yes Yes A database error that terminates Teradata TPump.
DML Error No No Yes Yes Teradata TPump is about to log a DML error to the error table.
Error Table No No Yes Yes Successful processing of the SEL COUNT(*) request for the error table.
Exit Yes Yes Yes Yes Teradata TPump completed a load task.
File or INMOD Open No No Yes Yes Successful processing of the IMPORT command.
Import Begin No No Yes Yes Teradata TPump is about to start reading records.
Import End No No Yes Yes Last record has been read.
Initialize Yes Yes Yes Yes Successful processing of the Notify option (BEGIN LOAD command).
InitializeEON Yes Yes Yes Yes Successful processing of the Notify option (BEGIN LOAD command).
Interim Run Statistics No No No Yes Teradata TPump is about to update the Monitor Interface table, or Teradata TPump successfully completed a checkpoint, or an Import has just completed successfully.
Table Statistics No Yes Yes Yes Teradata TPump has successfully written the table statistics.
Database Restart No No No Yes Teradata TPump received a crash error from the database or CLIv2.