The Basic Upsert Feature
When using the basic upsert feature:
By following these rules, a number of uses for the DO INSERT ROWS option can be found. In the past, data could be presorted into INSERTs and UPDATEs, or UPDATEs attempted with all the data, and then do an INSERT on any UPDATEs that failed. With upsert, Teradata TPump needs only one pass of the data to UPDATE rows that need to be updated and INSERT rows that need to be inserted.
Note: To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in the Teradata TPump job.
When MARK MISSING UPDATE ROWS specified, while using DO INSERT ROWS, Teradata TPump records any UPDATE that fails. This record appears in the Application Error Table, together with an error code that shows that the INSERT of the DO INSERT ROWS was then executed. If the INSERT fails, the INSERT row is also recorded in the Application Error table. The default for an upsert function, however, is not to mark missing update rows. This is because when the upsert function is performed, the INSERT is expected to occur when the UPDATE fails. The failure of the UPDATE portion of an upsert does not, in itself, constitute an error and should not be treated as one.
The MARK MISSING DELETE ROW option has no meaning when used with the DO INSERT ROWS option.
The option of MARK (IGNORE) EXTRA DELETE (UPDATE) ROWS provides Teradata TPump with a way to protect against an update or delete affecting multiple rows, which can happen in Teradata TPump because the primary index can be non‑unique.
MARK is the default for all DML options, except for an upsert.
Each record in the following example contains the value of the primary index column (EmpNo) of a row of the Employee table whose PhoneNo column is to be assigned a new phone number from field Fone.
When the UPDATE fails, the INSERT statement is activated and Teradata TPump enters the upsert mode. In this case, each record contains the primary index value (EmpNum) of a row that is to be inserted successively into the Employee table whose columns are EmpNo and PhoneNo.
.BEGINLOAD SESSION number;
.FIELD EmpNum 1 INTEGER;
.FIELD Fone * (CHAR (10));
.DML LABEL DMLlabelname
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE Employee SET PhoneNo = :Fone WHERE EmpNo = :EmpNum;
INSERT Employee (EmpNo, PhoneNo) VALUES (:EmpNum, :Fone);
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
The scope of a DML command (and its label) terminates at the first following command of any kind or at the end of the file containing the DML statements, whichever occurs first.
The SQL EXECUTE command must be between the BEGIN LOAD command and END LOAD command.
For IMPORT tasks, up to seven distinct error treatment options for one DML command may be specified. For example:
.DML LABEL COMPLEX
IGNORE DUPLICATE INSERT ROWS
MARK DUPLICATE UPDATE ROWS
IGNORE MISSING UPDATE ROWS
MARK MISSING DELETE ROWS
DO INSERT FOR MISSING UPDATE ROWS;
It is valid to specify that missing update rows be both marked and treated as INSERTs or, as in the example, both ignored and treated as INSERTs.
If Teradata TPump encounters any of the following:
it writes a diagnostic message to the primary output destination for the system, terminates the Teradata TPump task, and returns to the main Teradata TPump control module with a conventional nonzero return code. The error can be corrected and resubmitted to the Teradata TPump task.
The DML commands (with their following DML statements) must appear between the appropriate BEGIN LOAD command and the IMPORT commands that refer to them. When the END LOAD command is encountered, the sequence of DML commands and DML statements is forgotten. The DML command cannot be shared by multiple BEGIN LOAD statements. The DML statements are described in the following sections.
The maximum number of DML commands that can be used in a single Teradata TPump task is 128. If an excessive number of DML commands and statements are sent to Teradata Database, an error message is logged, stating that there are too many DML steps for one Teradata TPump job.