Teradata TPump Teradata SQL Statements
The following Teradata SQL statements supported by Teradata TPump are included in this chapter because they require special considerations for use with Teradata TPump. They are used for loading purposes and for creating Teradata TPump macros. The syntax and use of these Teradata SQL statements are described in detail in this chapter.
Table 25 lists the Teradata SQL Statements supported by Teradata TPump.
Statement | Definition
DATABASE | Changes the default database qualification for all DML statements.
DELETE | Removes specified rows from a table.
EXECUTE | Specifies a user-created (predefined) macro for execution. The macro named in this statement resides in Teradata Database and specifies the type of DML statement (INSERT, UPDATE, or DELETE) being handled by the macro.
INSERT | Adds new rows to a table by directly specifying the row data to be inserted.
UPDATE Statement and Atomic Upsert | Changes field values in existing rows of a table.
ACCEPT
Purpose
The ACCEPT command accepts data types and values from an external source and uses them to set one or more utility variables. The ACCEPT command is a valid command preceding LOGON and LOGTABLE commands.
Syntax
where
Syntax Element |
Description |
var |
Name of the utility variable that is to be set with the value accepted from the designated source. Character string values appear as quoted strings in the data file. |
env_var |
Environment variable that provides the value for the specified utility variables (var) |
fileid |
Data source of the external system. On UNIX and Windows systems, fileid is infilename (the path name for a file). If the path name has embedded white space characters, enclose the entire path name in single or double quotes. If the path name is enclosed by single quotation marks and contains an embedded single quotation mark, an escape character (a second single quotation mark) must precede the embedded single quotation mark. Likewise, if the path name is enclosed by double quotation marks and contains an embedded double quotation mark, an escape character (a second double quotation mark) must precede the embedded double quotation mark. A path name that contains single or double quotation marks should itself be enclosed in either single or double quotation marks. On z/OS, fileid is a true DDNAME; the external system DD (or similar) statement specifies a file. If DDNAME is specified, Teradata TPump reads data records from the specified source. A DDNAME must obey the same rules for its construction as Teradata SQL column names, and must also obey the applicable rules of the external system; it may reference a sequential or VSAM data set. If DDNAME represents a data source on magnetic tape, the tape may be either labelled or nonlabelled (if the operating system supports it). |
charpos1 and charpos2 |
Start and end character positions of a field in each input record that contains extraneous information. Teradata TPump ignores the specified field(s). |
Usage Notes
A single record, row, or input line is accepted from the designated source. Ensure that there is only one record in the file from which the ACCEPT command is getting the variables.
If multiple variables are coded, each is sequentially assigned input text up to the first white space character encountered that is not within a quoted string.
Input text for numeric values must be delimited only by white space or record boundaries. Input text for character strings must be enclosed in apostrophes. For example:
.Accept age, name from file info;
The data record provided to satisfy the preceding ACCEPT should include two fields. The following example shows two sample data records, where the first is correct but the next is not:
32 'Tom' /* This line contains valid data. */
32 Tom /* Tom is invalid data. */
An additional method of placing comments in input text is as follows:
32 'Tom'; This line contains valid data.
When the number of variables listed is greater than the number of responses available, unused variables remain undefined (NULL). If there are not enough variables to hold all responses, a warning message is issued. If the input source is a file, the next record (starting with the first) of the file is always retrieved.
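For example, the following fragment (file and variable names are hypothetical) reads a logon string from a one-record file and uses it in the subsequent LOGON command:

.ACCEPT logonstr FROM FILE logoninfo;
.LOGTABLE Mydb.Mylogtable;
.LOGON &logonstr;

The file logoninfo would contain a single record holding the logon string as a quoted character string, since the value is assigned to a utility variable and then substituted into the LOGON command.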
BEGIN LOAD
Purpose
The BEGIN LOAD command initiates or restarts a Teradata TPump task, specifying the number of sessions to use and any other parameters needed to execute the task.
Syntax
where
Syntax Element |
Description |
||
SESSIONS |
Keyword for the number of Teradata TPump sessions. |
||
number |
Number of sessions to be logged on for update purposes for Teradata TPump. A Teradata TPump task logs on and uses the number of sessions specified. One additional session is used for performing various utility functions. There is no default value for number; it must be specified. Neither is there a maximum value, except for system‑wide session limitations, which vary among machines. Limiting the number of sessions conserves resources on both the external system and Teradata Database. This conservation is at the expense of a potential decrease in throughput and increase in elapsed time. |
||
threshold |
Minimum number of sessions to be logged on for update purposes for the utility. If session limits are reached after at least the threshold number of sessions has been logged on, Teradata TPump stops trying to log on additional sessions and uses the sessions already logged on. If session limits are reached before the threshold is reached, Teradata TPump logs off all sessions, waits for the time determined by the SLEEP value, and tries to log on again. |
||
ERRORTABLE |
Optional keyword for identifying a database and error table. Use a database name as a qualifying prefix to the error table. Specifying a database other than the production database avoids cluttering the production system with error tables. Because the designated database should be allocated ample PERM space, the PERM space of the databases containing the tables involved in the Teradata TPump task does not have to be increased. |
||
APPEND |
Specifies use of an existing error table. If the specified error table does not exist, Teradata TPump creates it. If the structure of the existing error table is not compatible with the error table Teradata TPump creates, the job fails with an error when Teradata TPump tries to insert into or update the error table. |
||
NODROP |
Specifies that the error table is to be retained (not dropped) even if it is empty at the end of a job. NODROP can be used alone or with APPEND to persist the error table. |
||
QUEUETABLE |
Selects the error table as a Queue Table. |
||
dbname. |
The qualified database for the error table. If the database is not specified, the database which contains the log table is used. The period following the dbname separates the database name from the tname parameter. If a different database is specified, it may help to avoid cluttering the production database with error tables. |
||
tname |
Error table that receives information about errors detected during the load. tname may be preceded by a database name qualifier. This table is referred to as the error table or ET table. Teradata TPump explicitly specifies FALLBACK as the level of protection. The default error table name is composed of the job name, followed by an underscore and the sequence number of the load, then an underscore and ET, as in jobname_nnn_ET. tname identifies a nonexisting table for a nonrestart task, or an existing table for a restart task. For all errors inserted in this error table, the identifiers for the offending combination of statement and data record are included in the appropriate row of tname. The columns in the error table allow the identification of a specific data record and statement combination that produced an error. The column names and definitions of the error table are:
ImportSeq – A byteint containing the IMPORT command sequence number.
DMLSeq – A byteint containing the sequence number of the DML command within the command file.
SMTSeq – A byteint containing the sequence number of the DML statement within the DML command.
ApplySeq – A byteint containing the sequence number of the APPLY clause within its IMPORT command.
SourceSeq – An integer containing the position of a data record within a data source.
DataSeq – A byteint identifying the data source. This value is always one.
ErrorCode – An integer containing an error return code.
ErrorMsg – Contains the corresponding error message for the error code.
ErrorField – A smallint which, if valid, indicates the bad field. The names of record fields sent to Teradata Database are specified via the LAYOUT command, in conjunction with FIELD and TABLE commands.
HostData – A variable length byte string containing the data sent by the external system.
RowInsertTime – Indicates when the row was inserted into the Teradata TPump error table. |
||
ERRLIMIT |
Optional keyword for setting a limit on records rejected for errors. When ERRLIMIT is exceeded, Teradata TPump performs a checkpoint, then terminates the job. The data read before ERRLIMIT was exceeded is submitted and completed before the job terminates. As a result, when a job is terminated because ERRLIMIT was exceeded, there may be more error records in the error table than the number specified by ERRLIMIT. To facilitate diagnosis of data errors, ERRLIMIT should be greater than the number of statements packed into one request. |
||
errcount |
Error threshold for controlling the number of rejected records. Usage depends on whether used with the errpercent parameter. For example, if errcount = 100 and errpercent = 5, then 100 records must be sent to Teradata Database before the approximate 5 percent rejection limit is applied. If only the first five records are rejected when the 100th record is sent, the limit is not exceeded. If there are six rejections, however, then the limit is exceeded. After the 100th record is sent, Teradata TPump stops processing if the 5 percent limit has been exceeded. When the limit has been exceeded, Teradata TPump writes an error message to the external system’s customary message destination and terminates the task. All tables in use are left in their state at the time of the termination. This allows errors to be corrected in data records and restarting of the task from the last checkpoint. If a restart is not possible or not desired, any unwanted tables should be dropped. |
||
CHECKPOINT |
Keyword indicating the number of minutes between the occurrences of checkpoints. This is followed by a frequency value. |
||
frequency |
The interval, in minutes, between checkpoint operations. Specify an unsigned integer from 0 through 60, inclusive. If the CHECKPOINT frequency is from 1 through 60, a checkpoint is recorded at the specified frequency, in minutes. If the CHECKPOINT frequency is more than 60, Teradata TPump terminates the job. Specifying a CHECKPOINT frequency of zero bypasses all checkpoint functions; Teradata TPump does not perform checkpoint operations at all, which differs from Teradata Tools and Utilities 07.00 and earlier. In this case, the Teradata TPump job is not restartable once data loading has begun. If a CHECKPOINT frequency is not specified, checkpointing occurs every 15 minutes by default. Whether specified or not, checkpoints are written at the end of each data input source. Note: Checkpoints should not be set for an FDL-compatible INMOD routine with the FOR, FROM, or THRU options. When an FDL-compatible INMOD routine is used with the FOR, FROM, or THRU options, Teradata TPump terminates and an error message appears if the checkpoint frequency is other than zero. |
||
DATAENCRYPTION ON/OFF |
Keyword to encrypt import data and request text during communication between Teradata TPump and Teradata Database. If ON, encryption is performed. If DATAENCRYPTION is not specified, the default is OFF. The "-y" runtime parameter applies encryption to all connected sessions, including the control session and the load sessions. This option applies encryption only to the load sessions (the sessions specified by the SESSIONS keyword in the BEGIN LOAD command) and overrides the "-y" runtime parameter when OFF is explicitly specified. For example, assuming the PARTITION command is not used in the job, when the "-y" runtime parameter is specified and DATAENCRYPTION OFF is specified in the script, encryption applies only to the control session. Similarly, assuming the PARTITION command is not used in the job, when the "-y" runtime parameter is not specified and DATAENCRYPTION ON is specified in the script, encryption applies to all load sessions but not the control session. When the PARTITION command is used, an encryption setting explicitly specified in the PARTITION command overrides the setting of this option for the sessions defined by that PARTITION command. |
||
ArraySupport |
Specifies the “ArraySupport ON|OFF” option for the .BEGIN LOAD command; the option is also available on the .DML command. When “ArraySupport ON” is specified in the .BEGIN LOAD command, the .DML commands enclosed in the .BEGIN LOAD and .END LOAD command pair use the ArraySupport feature for their DML statements, unless “ArraySupport OFF” is specified for a particular .DML command. The default value of ArraySupport for the .BEGIN LOAD command is OFF. When “ArraySupport ON|OFF” is not specified with a .DML command, the default value of ArraySupport for that .DML command is the effective setting of ArraySupport in the .BEGIN LOAD command where the .DML command resides. When “ArraySupport ON|OFF” is specified at the .DML command, the specified value overrides the default setting determined by the .BEGIN LOAD command. When a .DML command uses the ArraySupport feature (defined in the .BEGIN LOAD command or in the .DML command), it must contain one and only one DML statement, and the session partition that the .DML command references must be used by this .DML command exclusively. If the DML statement is an UPSERT-type statement, it can be specified as a pair of INSERT/UPDATE statements with the DO INSERT FOR MISSING UPDATE clause; Teradata TPump creates the equivalent UPDATE … ELSE INSERT … form (that is, an Atomic Upsert) and uses it as the actual DML statement. Alternatively, an UPDATE … ELSE INSERT … statement can be directly specified with the DO INSERT FOR MISSING UPDATE clause. The non-atomic form of UPSERT is not supported by Teradata TPump Array Support. |
||
TENACITY |
Keyword (with hours parameter) defining how long the utility tries to log on the sessions needed to perform the Teradata TPump job. If a logon is denied, Teradata TPump delays for the time specified by the SLEEP parameter (the default is six minutes) and retries the logon. It retries until either the logon succeeds or the number of hours specified by TENACITY is exceeded. If the TENACITY parameter is not specified, the utility retries the logons for four hours. |
||
hours |
Teradata TPump tenacity factor, as an integral number of hours, specifying how long Teradata TPump keeps trying to log on the required sessions. The default value for hours is 4 if the parameter is not specified. If hours is specified as 0, Teradata TPump does not retry logons after a logon fails because of a capacity limit. When a “no more sessions” error appears (either a 301 return code from a workstation CLI or a 513 return code from a mainframe CLI), Teradata TPump drops the sessions already acquired and terminates the job without trying another logon. |
||
LATENCY |
Keyword for flushing stale buffers. Note: When using the Teradata TPump latency option with Named Pipe Access Module, need_full_block = no option should be added in the Named Pipe Access Module initialization string. |
||
seconds |
Flushing threshold, based on the number of seconds the oldest record has resided in the buffer. LATENCY cannot be less than one second. If the SERIALIZE parameter is set to OFF, only the current buffer can possibly be stale. If SERIALIZE is ON, the number of stale buffers can range from zero to the number of sessions. |
||
NOTIMERPROCESS |
Keyword to tell Teradata TPump not to fork a child process as a timer process. When a child process is forked, the SIGUSR2 signal notifies the parent process when the latency period expires. When a child process is not forked, the SIGALRM signal notifies the Teradata TPump process when the latency period expires. A child process is necessary for the latency function to work properly on the UNIX platforms when the MQSeries Access Module is used. |
||
minutes |
Number of minutes to wait between unsuccessful logon attempts due to session limits errors on Teradata Database or CLIv2. If SLEEP is not specified, the default between unsuccessful logon attempts is 6 minutes. |
||
SERIALIZE ON/OFF |
Keyword to set the state (ON/OFF) of the serialization feature which, if ON, guarantees that operations on a given key combination (row) occur serially. If SERIALIZE is not specified, the default is OFF. This feature is meaningful only when a primary key for the loaded data is specified by using the KEY option of the FIELD command. Serialization must be set to OFF if the .TABLE command is used and the target table is a NoPI table. To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in a Teradata TPump job. |
||
PACKMAXIMUM |
Keyword requesting Teradata TPump to dynamically determine the maximum possible PACK factor for the current load. Maximum value is 1500. Displayed in message UTY6652, the value thus determined should be specifically used on subsequent runs, as the use of PACKMAXIMUM requires iterative interactions with the database during initialization to heuristically determine the maximum possible PACK factor. Note: Since the maximum packing factor has changed from 600 to 1500, it will take a longer time for Teradata TPump to test the packing factor if PACKMAXIMUM is specified. |
||
PACK |
Keyword for the number of statements to pack into a multiple‑statement request. Maximum value is 1500. Packing improves network/channel efficiency by reducing the number of sends and receives between the application and Teradata Database. |
||
statements |
Number of statements, as a positive integer of up to 1500, to pack into a multiple‑statement request. Default value is 20 statements per request. Under certain conditions, Teradata TPump may determine that the pack factor has been set too high. Teradata TPump then automatically lowers the pack setting to an appropriate value and issues warning message UTY6625, for instance: “UTY6625 WARNING: Packing has been changed to 12 statements per request” and continues. Packing improves network/channel efficiency by reducing the number of sends/receives between the application and the database. The packing factor is validated by sending a fully packed request to Teradata Database using a prepare. This test checks for syntax problems and requests that are excessively large and may overwhelm the parser. To simplify the script development process, Teradata TPump ignores certain errors returned by an overloaded parser, shrinks the request, retries the prepare until it executes successfully and finally, issues a warning noting the revised packing factor size. When this happens, the Teradata TPump script should be modified to eliminate the warning, thereby avoiding the time‑consuming process of shrinking the request. A packing failure may occur if the source parcel length does not match the data defined. If this happens, Teradata TPump issues the message: “UTY2819 WARNING: Packing may fail because input data does not match with the data defined.” To resolve this problem, increase the packing factor and resubmit the job. |
||
RATE |
Keyword for entering the rate at which statements are sent to Teradata Database. |
||
RETRYTIMES nn |
Keyword specifying the number of times (nn) a request is retried. The default is 16. If nn equals 0, the retry count is set to 16. If RETRYTIMES is set, it takes effect only for the requests/data between a "BEGIN LOAD" and "END LOAD" pair. |
||
statement_rate |
Initial maximum rate at which statements are sent to Teradata Database per minute. The statement rate must be a positive integer. If the statement rate is unspecified, the rate is unlimited. If the statement_rate is less than the statement packing factor, Teradata TPump sends requests smaller than the packing factor. If the Teradata TPump Monitor is in use, the statement_rate can be changed later on. |
||
SLEEP |
Keyword for the number of minutes to sleep. |
||
NOATOMICUPSERT |
Keyword to perform non‑atomic upsert operations for UPSERT DMLs in the job script if these UPSERT DMLs are not provided in the Atomic UPSERT form. |
||
NOMONITOR |
Keyword to prevent Teradata TPump from checking for statement rate changes from, or updating status information for, the Teradata TPump Monitor. |
||
ROBUST ON/OFF |
The ON parameter ensures data integrity when target tables have identity columns. The OFF parameter signals Teradata TPump to use simple restart logic. In this case, restarts cause Teradata TPump to begin where the last checkpoint occurred in a job, and any processing that occurred after the checkpoint is redone. This method does not have the extra overhead of the additional database writes in the Robust logic.
Simple logic is adequate only for DML statements that can be repeated without changing the results of the operation. Examples of statements that are not safe for simple logic include the following: |
||
MACRODB |
Keyword for database to contain any macros used by Teradata TPump. |
||
dbname |
Name of database which is to contain any macros built/used by Teradata TPump. This database overrides the default placement of macros into the database which contains the log restart table. |
||
NOTIFY |
Teradata TPump implementation of the notify user exit option: |
||
EXIT name |
Keyword phrase that calls a user‑defined exit where name is the name of a user‑supplied library with a member name of _dynamn. The exit must be written in C, or in a language with a runtime environment that is compatible with C. On some versions of UNIX systems, ./prefix characters may have to be added to the EXIT name specification if the module is in the current directory. |
||
TEXT 'string' |
User‑supplied string of up to 80 characters that Teradata TPump passes to the named exit routine. The string specification must be enclosed in single quote characters ('). |
||
MSG 'string' |
User-supplied string of up to 16 characters that Teradata TPump logs. The string specification must be enclosed in single quote characters ('). |
||
EXIT64 name |
Keyword phrase that calls a user-defined exit, where name is the name of a user-supplied library with a member name of _dynamn. When the EXIT64 keyword is present, the related row counters use 8-byte unsigned integers instead of 4-byte unsigned integers. |
||
EXITEON name |
Keyword phrase that calls a user-defined exit, where name is the name of a user-supplied library with a member name of _dynamn. The exit must be written in C, or in a language with a runtime environment that is compatible with C. On some versions of UNIX systems, ./ prefix characters may have to be added to the EXITEON name specification if the module is in the current directory. The keyword EXITEON automatically supports the keyword EXIT64. |
Table 26 lists events that create notifications.
Event | Low | Medium | High | Ultra | Signifies
(Low, Medium, High, and Ultra are the notification levels at which each event generates a notification.)
Checkpoint Begin | No | No | Yes | Yes | Teradata TPump started a checkpoint.
Checkpoint End | No | No | Yes | Yes | Teradata TPump successfully completed a checkpoint.
CLIv2 Error | Yes | Yes | Yes | Yes | Teradata TPump received a CLIv2 error.
Database Error | Yes | Yes | Yes | Yes | A Teradata Database error that terminates Teradata TPump.
DML Error | No | No | Yes | Yes | Teradata TPump is about to log a DML error to the error table.
Error Table | No | No | Yes | Yes | Successful processing of the SEL COUNT(*) request for the error table.
Exit | Yes | Yes | Yes | Yes | Teradata TPump completed a load task.
File or INMOD Open | No | No | Yes | Yes | Successful processing of the IMPORT command.
Import Begin | No | No | Yes | Yes | Teradata TPump is about to start reading records.
Import End | No | No | Yes | Yes | Last record has been read.
Initialize | Yes | Yes | Yes | Yes | Successful processing of the Notify option (BEGIN LOAD command).
InitializeEON | Yes | Yes | Yes | Yes | Successful processing of the Notify option (BEGIN LOAD command).
Interim Run Statistics | No | No | No | Yes | Teradata TPump is about to update the Monitor Interface table, or Teradata TPump successfully completed a checkpoint, or an Import has just completed successfully.
Table Statistics | No | Yes | Yes | Yes | Teradata TPump has successfully written the table statistics.
Teradata Database Restart | No | Yes | Yes | Yes | Teradata TPump received a crash error from Teradata or CLI.
Usage Notes
Multiple tables can be targeted by a single Teradata TPump job.
If the script author is uncertain whether or not to use ROBUST restart logic, it is always safe to use the ROBUST ON parameter.
To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in the Teradata TPump job.
The statement rate per minute you set using the RATE keyword is also affected by the periodicity value. By default, Teradata TPump uses a periodicity value of four when enforcing the statement rate limit. You can adjust the periodicity rate from 1 to 1500 using a runtime parameter.
For example, if you set the statement rate at 1600 and the periodicity at 10, then the maximum number of statements processed is 160 (1600/10) statements every 6 (60/10) seconds.
Notice:
A LOGOFF command entered after a BEGIN and before the matching END LOAD logs you off the Teradata TPump utility.
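The following sketch shows how several of these parameters might be combined in one BEGIN LOAD command (all names and values are hypothetical, and the ellipsis stands for the DML and IMPORT commands of the task):

.BEGIN LOAD
    SESSIONS 8 4
    ERRORTABLE Errdb.Myjob_001_ET
    ERRLIMIT 100 5
    CHECKPOINT 10
    SERIALIZE ON
    PACK 40
    TENACITY 2 SLEEP 3
    ROBUST ON;
...
.END LOAD;

Here 8 sessions are requested with a threshold of 4; the job terminates if more than approximately 5 percent of records are rejected once 100 records have been sent; checkpoints are taken every 10 minutes; and logons are retried for up to 2 hours, sleeping 3 minutes between attempts.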
DATABASE
Purpose
Teradata TPump supports the Teradata SQL DATABASE statement, which changes the default database qualification for all unqualified DML and DDL statements.
Syntax
where
Syntax Element |
Description |
|||
database |
New qualified default database for the error table. Changes the database from the one originally specified by the BEGIN LOAD command. |
Usage Notes
The DATABASE command only affects native SQL commands. In particular, it has no effect on the BEGIN LOAD command.
The DATABASE command does affect INSERT, UPDATE, DELETE, and EXEC statements issued as part of a load. (When Teradata TPump logs on sessions, it immediately issues a DATABASE statement on each session.)
The DATABASE command does not affect the placement of Teradata TPump macros.
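For example, with the following statement in a script, subsequent unqualified DML statements resolve against the Payroll database (database, table, and field names are illustrative):

DATABASE Payroll;
.DML LABEL Inslabel;
INSERT INTO Employee (EmpNo, Name) VALUES (:EmpNum, :EmpName);

The INSERT targets Payroll.Employee, but the macros Teradata TPump builds for the load are still placed according to the MACRODB parameter or the log table database.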
DATEFORM
Purpose
The DATEFORM command lets you define the form of the DATE data type specifications for the Teradata TPump job.
Syntax
where
Syntax Element |
Description |
INTEGERDATE |
Keyword that specifies integer DATE data types for the Teradata TPump job. This is the default Teradata DATE data type specification for Teradata TPump jobs if you do not enter a DATEFORM command. |
ANSIDATE |
Keyword that specifies ANSI fixed‑length CHAR(10) DATE data types for the Teradata TPump job |
Usage Notes
Table 27 describes the things to consider when using the DATEFORM command.
Topic |
Usage Notes |
|||
Command Frequency and Placement |
|
|||
Data Type Conversions |
When the ANSIDATE specification is used, the ANSI/SQL DateTime data types must be converted to fixed-length CHAR data types when specifying the column/field names in the Teradata TPump FIELD command. See the FIELD command for a description of the fixed-length CHAR representations for each DATE, TIME, TIMESTAMP, and INTERVAL data type specification. |
Release Applicability |
The ANSIDATE specification is valid for Teradata TPump jobs on Teradata Database for a UNIX OS. |
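As a sketch of DATEFORM usage (layout and field names are hypothetical), a job that reads ANSI-style dates would specify ANSIDATE before the load and describe the date field as a fixed-length character field:

.DATEFORM ANSIDATE;
.LAYOUT Datelayout;
.FIELD HireDate * CHAR(10);  /* holds 'yyyy-mm-dd' */

With INTEGERDATE (the default), the same field would instead be handled in the Teradata integer DATE representation.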
DELETE
Purpose
Teradata TPump supports the DELETE Teradata SQL statement, which removes rows from a table.
Syntax
where
Syntax Element |
Description |
tname |
Table from which rows are to be deleted. tname is qualified either explicitly by database name, or by the current default database. |
WHERE condition |
Conditional clause identifying the row(s) to delete. The conditional clause uses values from input data record fields as defined by a FIELD command or TABLE command of the layout referenced by an IMPORT using this statement. |
Usage Notes
The following notes describe how to use DELETE statements following a DML command.
A DELETE statement may also be used in the support environment; normal rules for DELETE are followed in that case.
Teradata TPump operates only on single-table statements, so DELETE statements must not contain any joins.
To delete records from a table, the username specified on the LOGON command must have DELETE privilege on the specified table.
When the condition(s) of the DELETE statement WHERE clause are evaluated, the result can be definitely true, definitely false, or indeterminate. If the result is true for a specific row, Teradata TPump deletes the row. An indeterminate result, due to an abnormal arithmetic condition such as underflow, overflow, or division by zero, is treated as an error, and Teradata TPump records both row and error code in the error table.
The DELETE statement must identify only one object.
Remember the following when constructing scripts:
The OR construct can be used in the WHERE clause of a DELETE statement; alternatively, two or more separate DML statements (one per OR term) can be used, with the DML statements applied conditionally with the APPLY clause of the IMPORT command. The nature of the alternatives will usually make one of the methods more appropriate.
DML validtime qualifier and NONTEMPORAL semantics are supported. For more information, see SQL Data Manipulation Language (B035‑1146).
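The two alternatives described above can be sketched as follows (the record layout, labels, and RecType field are hypothetical):

/* Single DML statement using OR */
.DML LABEL DelEither;
DELETE Employee WHERE EmpNo = :EmpNum OR DeptNo = :DeptNum;

/* Two DML statements applied conditionally */
.DML LABEL DelByEmp;
DELETE Employee WHERE EmpNo = :EmpNum;
.DML LABEL DelByDept;
DELETE Employee WHERE DeptNo = :DeptNum;
.IMPORT INFILE Infilename LAYOUT Layoutname
    APPLY DelByEmp WHERE RecType = 'E'
    APPLY DelByDept WHERE RecType = 'D';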
Example
The following example uses an input data source containing a series of one‑field, four‑byte records. Each record contains the value (EmpNum) of the primary index column (EmpNo) of a row to be deleted from the Employee table.
.BEGIN LOAD SESSIONS number;
.LAYOUT Layoutname;
.FIELD EmpNum 1 INTEGER;
.DML LABEL DMLlabelname;
DELETE Employee WHERE EmpNo = :EmpNum;
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
.END LOAD;
DISPLAY
Purpose
The DISPLAY command can be used to write messages to the specified destination.
Syntax
where
Syntax Element |
Description |
'text' |
Text to be written to the specified output destination |
fileid |
Data source of the external system. On UNIX and Windows systems, fileid is infilename (the path name for a file). If the path name has embedded white space characters, enclose the entire path name in single or double quotes. If the path name is enclosed by single quotation marks and contains an embedded single quotation mark, an escape character (a second single quotation mark) must precede the embedded single quotation mark. Likewise, if the path name is enclosed by double quotation marks and contains an embedded double quotation mark, an escape character (a second double quotation mark) must precede the embedded double quotation mark. A path name that contains single or double quotation marks should itself be enclosed in either single or double quotation marks. On z/OS, fileid is a true DDNAME; the external system DD (or similar) statement specifies a file. If DDNAME is specified, Teradata TPump reads data records from the specified source. A DDNAME must obey the same rules for its construction as Teradata SQL column names, and must also obey the applicable rules of the external system; it may reference a sequential or VSAM data set. If DDNAME represents a data source on magnetic tape, the tape may be either labelled or nonlabelled (if the operating system supports it). |
Usage Notes
Utility variables are replaced by their values before text is displayed. This is done by preceding the variable name with an ampersand (&). To display the name of a utility variable, code two “&”s instead of one.
To display an apostrophe within the text string, code two consecutive apostrophes (single quotation marks) to distinguish it from the single quotation marks enclosing the string.
On UNIX systems, if the same outfilename is used both to redirect stdout and as the fileid in a DISPLAY command, the results written to outfilename may be incomplete because of conflicting writes to the same file.
On UNIX systems, use an asterisk (*) as the fileid specification to direct the display messages to the system console/standard output (stdout) device.
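For example, a sketch of a DISPLAY command that substitutes a utility variable and uses the escaping rules above (the variable name RegionName and the file name msgfile.txt are illustrative, not part of any job in this chapter):

.SET RegionName TO 'West';
.DISPLAY 'Region &RegionName: load didn''t fail; literal && shown' TO FILE msgfile.txt;

The single ampersand substitutes the value of RegionName, the doubled apostrophes display one apostrophe, and the doubled ampersands display one ampersand.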
DML
Purpose
The DML command defines a label and error treatment options for one or more immediately following DML statements. DML statements relevant to a Teradata TPump job are INSERT, UPDATE, DELETE, and EXECUTE, with UPDATE and INSERT statements sometimes paired to form either a basic upsert or an Atomic upsert operation.
Syntax
where
Syntax Element |
Description |
LABEL |
Keyword indicating that the following parameter is a label for the DML statements that follow |
label |
Unique label to be used for the immediately following set of one or more DML statements A label must obey the same rules for its construction as Teradata SQL column names. The label name may be referenced in the APPLY clause of an IMPORT command. |
MARK |
Keyword indicating that the system should make a duplicate, missing, or extra INSERT, UPDATE, or DELETE row entry in the error table and continue processing. A row is a duplicate row if all column values in the row are the exact duplicate of another row. Duplicate row checking is bypassed if the table is a multiset table (which allows duplicate rows), or if the table has one or more unique indexes (the uniqueness test(s) make any duplicate row check unnecessary). If MARK is set and a uniqueness violation occurs on either a unique primary index or a unique secondary index, the offending rows go to the error table, whether or not the row is a duplicate row. In the case of an upsert, both the INSERT and UPDATE portions must fail for an error to be recorded. |
|
If neither MARK nor IGNORE is specified for duplicate rows, MARK applies to both INSERTs and UPDATEs. Similarly, if neither MARK nor IGNORE is specified for missing or extra rows, MARK applies to both UPDATEs and DELETEs. |
|
MARK is the default for all conditions except the upsert operation, for which IGNORE is the default. |
IGNORE |
Keyword indicating that the system should not make an error table entry for a duplicate, missing, or extra INSERT, UPDATE, or DELETE row; the system continues processing instead. A row is a duplicate row if all column values in the row are the exact duplicate of another row. Duplicate row checking is bypassed if the table is a multiset table (which allows duplicate rows), or if the table has one or more unique indexes (the uniqueness tests make any duplicate row check unnecessary); in these cases, IGNORE DUPLICATE ROWS has no effect. Any uniqueness violation results in the offending rows going to the error table. |
|
If neither INSERT nor UPDATE is specified for duplicate rows, IGNORE applies to both INSERTs and UPDATEs. |
|
Similarly, if neither UPDATE nor DELETE is specified for missing or extra rows, IGNORE applies to both UPDATEs and DELETEs. IGNORE is the default condition for an upsert operation. |
INSERT |
Keyword enabling the upsert feature, specified as DO INSERT FOR MISSING UPDATE ROWS or DO INSERT ROWS. |
|
An upsert saves time while loading a database by completing in one pass an operation that requires two passes with other utilities. The DML statements that follow this option must consist of a single UPDATE statement followed by a single INSERT statement. This option first executes the UPDATE statement. If the UPDATE fails because the target row does not exist, Teradata TPump automatically executes the INSERT statement. This capability allows updates to the database without first presorting the data into rows to be updated and rows to be inserted. Further information on the usage and restrictions of the upsert feature appears in the following usage notes. |
PARTITION |
Optional keyword used to name a session partition to be used for all SQL requests associated with this DML command. If this keyword is not present, a session created from the SESSIONS specification is used. Note: If serialization of two or more DML statements is required, the statements cannot be put in different partitions; serialization requires that all DML statements with identical row hash values be submitted from the same session. When multiple DMLs are used with a pack factor greater than 1, more throughput can be achieved by using partitions. If DBQL analysis shows high PE CPU relative to AMP CPU, assigning each DML to its own Teradata TPump partition can reduce parsing: all requests within a partition are then textually identical and therefore share a single statement cache entry. Additional performance benefits can be achieved through concurrent step execution, a higher pack factor, and an increased number of sessions. |
partition_name |
Parameter identifying the partition name The partition name must obey the same rules for its construction as Teradata SQL column names. |
SERIALIZEON |
Keyword used to turn serialization on for the specified fields. The SERIALIZEON keyword may be used before, after, or between any IGNORE or MARK statements. |
serialize_on_field |
Parameter identifying the field names where serialization is turned on This is the same field name used in the LAYOUT command which was used by the INSERT statement and referenced by the APPLY clause. Separate the field names with a comma and enclose them in parentheses. |
USE |
Keyword used to specify the fields to be used with a DML’s SQL statements. Use of this keyword allows specification of the FIELDs from the LAYOUT command that are actually needed for each DML, so that data from unneeded fields is not sent. The USE keyword may be placed before, after, or between any IGNORE or MARK statements. |
use_field |
Parameter identifying the field names to use Every LAYOUT FIELD used by any of the DML’s SQL statements must be enumerated in the USE list; otherwise, an error will occur. Separate the field names with a comma and enclose them in parentheses. |
ArraySupport |
The “ArraySupport ON|OFF” option applies to both the .BEGIN LOAD command and the .DML command. When “ArraySupport ON” is specified in the .BEGIN LOAD command, the .DML commands enclosed in the .BEGIN LOAD/.END LOAD command pair use the ArraySupport feature for their DML statements, unless “ArraySupport OFF” is specified on an individual .DML command. The default value of ArraySupport for the .BEGIN LOAD command is OFF. When “ArraySupport ON|OFF” is not specified with a .DML command, the default for that .DML command is the effective setting of ArraySupport in the .BEGIN LOAD command in which the .DML command resides. When “ArraySupport ON|OFF” is specified on the .DML command, the specified value overrides the default setting determined by the .BEGIN LOAD command. When a .DML command uses the ArraySupport feature, it must contain exactly one DML statement, and the session partition that the .DML command references must be used exclusively by that .DML command. If the DML statement is an upsert-type statement, it can be specified as a pair of INSERT/UPDATE statements with the DO INSERT FOR MISSING UPDATE clause; Teradata TPump creates the equivalent UPDATE … ELSE INSERT … form (an Atomic upsert) and uses it as the actual DML statement. Alternatively, an UPDATE … ELSE INSERT … statement can be specified directly with the DO INSERT FOR MISSING UPDATE clause. The non-atomic form of upsert is not supported by Teradata TPump Array Support. |
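As a sketch of the upsert form described in the table above (the table, column, and field names are illustrative, not from this chapter), a DML command pairing an UPDATE with an INSERT might look like this:

.DML LABEL UpsertAcct DO INSERT FOR MISSING UPDATE ROWS;
UPDATE Accounts SET Balance = :Bal WHERE AcctNo = :Acct;
INSERT INTO Accounts (AcctNo, Balance) VALUES (:Acct, :Bal);

Teradata TPump executes the UPDATE first; if it fails because no row matches the primary index value in AcctNo, the INSERT is executed instead, completing the upsert in a single pass over the input data.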
Usage Notes
The DML command must be used between the BEGIN LOAD command and the END LOAD command.
All INSERT, UPDATE, DELETE, and EXECUTE statements specified in the Teradata TPump script should fully specify the primary index of the referenced table to prevent the generation of table‑level locks.
A maximum of 1500 DML statements may be packed into a request; the default is 20 statements.
Teradata TPump assumes that row hash locking is used by INSERT, UPDATE, DELETE, and EXECUTE statements. If row hash locking is not used, Teradata TPump still runs, but may encounter problems because table-level locking causes statements to block one another.
In addition, Teradata TPump does not support UPDATE or EXECUTE statements that modify the primary index of the target table. Teradata TPump performs no checking to prevent the script author from creating DML that requests unreasonable updates, except that Teradata TPump will not use Atomic UPSERT if the UPDATE portion of the UPSERT specifies an unreasonable update. This restriction is imposed by Teradata Database.
IGNORE DUPLICATE ROWS does not apply if the table has any unique indexes.
Teradata TPump converts INSERT, UPDATE, and DELETE statements into macro equivalents, and, depending on the packing specified, submits multiple statements in one request. Teradata TPump also supports the EXECUTE statement, which can be used to bypass the macro creation step for frequently executed macros. For more information on the EXECUTE statement, refer to EXECUTE in this chapter.
The maximum number of INSERT, UPDATE, and DELETE statements that can be referenced in an IMPORT is 128. Note, however, that the 128th DML statement, which would insert a DML sequence number of 128 into the DMLSEQ field of the error table, can lead to a Teradata Database 3520 error.
At the end of an IMPORT, a utility variable is established for each DML statement executed. Teradata TPump variable names are not limited to 30 characters. These variables contain the activity counts associated with each statement. The variables created are of the form:
&IMP<n>_<Apply label>_<x>
where
n = the number of the IMPORT, from one through four.
Apply label = the label of the clause containing the DML command in question.
x = the number of the statement within the containing APPLY clause.
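For example, assuming an APPLY label named UpsertAcct in the first IMPORT, the activity count of its first statement could be tested after the import completes (a sketch; the label, message text, and file name are illustrative):

.IF &IMP1_UpsertAcct_1 = 0 THEN;
.DISPLAY 'First statement affected no rows' TO FILE msgfile.txt;
.ENDIF;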
Serialization
The SERIALIZEON keyword allows serialization to be turned on for the specified fields. The SERIALIZEON keyword can be used before, after, or between any IGNORE or MARK statements.
The SERIALIZEON keyword can also be used with the SERIALIZE keyword in the BEGIN LOAD command and with the KEY keyword in the FIELD command. When it is used in this way, the DML-level serialization overrides the BEGIN LOAD-level serialization.
In addition, the DML serialized APPLYs can be mixed with nonserialized DML APPLYs in the same IMPORT command.
See “BEGIN LOAD” and “FIELD” for details about these commands.
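For instance, a sketch of a DML command that serializes on a single key field (the table, column, and field names are illustrative):

.DML LABEL UpdBal SERIALIZEON (Acct);
UPDATE Accounts SET Balance = :Bal WHERE AcctNo = :Acct;

All input records with the same hash value for Acct are then submitted from the same session, so updates to the same row are applied in input order.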