Utilize Serialization with the Stream Driver - Parallel Transporter

Teradata® Parallel Transporter Application Programming Interface Programmer Guide

Product
Parallel Transporter
Release Number
17.10
Published
June 2021
Language
English (United States)
Last Update
2021-07-01
dita:id
B035-2516
lifecycle
previous
Product Category
Teradata Tools and Utilities

In certain uses of the Stream driver, a single job can make multiple changes to the same row. For instance, a row may be inserted and then updated during the job, or it may be updated and then deleted. In every such case, the operations must be applied in the order in which they were issued. The serialization feature lets the Stream driver guarantee that this ordering is maintained.

The serialization feature works by hashing each data record on a set of columns to determine which session transmits the record to the database. Records with the same values in those columns are therefore always sent on the same session. This adds overhead to the application in two ways: the cost of computing the hash, and the extra buffering needed to hold data rows when a request is already pending on the session chosen for transmission.
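The session selection described above can be sketched as follows. This is only an illustration of the idea, not the driver's implementation: `PickSession` is a hypothetical helper that is not part of the Teradata PT API, and `std::hash` is a stand-in for whatever hash function the Stream driver actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not part of the Teradata PT API): maps the values of
// the serialization columns of one record to a session index.
std::size_t PickSession(const std::vector<std::string>& keyValues,
                        std::size_t sessionCount) {
    assert(sessionCount > 0);
    std::size_t h = 0;
    for (const std::string& value : keyValues) {
        // Combine the per-column hashes. std::hash is an assumption made for
        // this sketch; the real driver's hash function is not documented here.
        h = h * 31 + std::hash<std::string>{}(value);
    }
    // Equal key values always yield the same index, so every change to a
    // given row travels on the same session and keeps its order.
    return h % sessionCount;
}
```

Under these assumptions, two records that share the same `Associate_Id` and `Associate_Name` values map to the same session index, which is what preserves the order of an insert followed by an update on that row.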

The serialization feature greatly reduces the potential frequency of database deadlocks. Deadlocks can occur when requests from the application happen to affect rows that use the same hash code within the database. Although the database and Teradata PT handle deadlocks correctly, the resolution process is time consuming and adds overhead to the application, because it must re-execute requests that were rolled back due to the deadlock.

In Teradata PT, serialization is enabled through the DMLGroup object. The DMLGroup object has an AddSerializeOn function which takes two arguments: the number of columns in the column set and a list of column names terminated by a NULL value.

dmlGr->AddSerializeOn(2, "Associate_Id", "Associate_Name", NULL);

The intent of the column set parameter is to allow the user to specify the columns corresponding to the primary index of the target table. If the DML statements of the DMLGroup specify more than one target table, it is up to the user to make sure that the primary indexes of all the tables match when using the serialization feature.

Each DMLGroup object has a single list of columns to use for serialization. Columns can be added to this list in one call or through a series of multiple calls to the AddSerializeOn function. For example, two columns can be added to a serialization list one at a time using separate calls to the AddSerializeOn function:

dmlGr->AddSerializeOn(1, "Associate_Id", NULL);
dmlGr->AddSerializeOn(1, "Associate_Name", NULL);

Or the two columns can be added in the same call:

dmlGr->AddSerializeOn(2, "Associate_Id", "Associate_Name", NULL);
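To make the equivalence of the two forms concrete, the accumulation behavior can be modeled with a small mock. `MockDmlGroup` and its `SerializeColumns` accessor are hypothetical names invented for this sketch; the real DMLGroup class comes from the Teradata PT API headers and may differ in its internals. The mock only mirrors the call shape shown in this section: a column count followed by that many column names and a terminating NULL.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration only; this is not the real DMLGroup.
class MockDmlGroup {
public:
    // Appends `count` column names to the single serialization list.
    void AddSerializeOn(int count, ...) {
        assert(count >= 0);
        va_list args;
        va_start(args, count);
        for (int i = 0; i < count; ++i) {
            const char* name = va_arg(args, const char*);
            if (name == nullptr) {
                break;  // list ended before `count` names were read
            }
            columns_.push_back(name);
        }
        va_end(args);
    }

    // Hypothetical accessor used here to inspect the accumulated list.
    const std::vector<std::string>& SerializeColumns() const {
        return columns_;
    }

private:
    std::vector<std::string> columns_;
};
```

With this mock, two single-column calls and one two-column call leave the same list, `Associate_Id` followed by `Associate_Name`, which is the behavior the section describes for the real object.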