Job Options - Parallel Transporter

Teradata Parallel Transporter Reference

Product: Parallel Transporter
Release Number: 15.10
Language: English (United States)
Last Update: 2018-10-07
dita:id: B035-2436
lifecycle: previous
Product Category: Teradata Tools and Utilities

SERIALIZE

The Serialize option only applies to the Stream operator. Use the Serialize option when correct sequencing of transactions is required. For example, when a job contains a transaction that inserts a row to open a new account, and another transaction updates the balance for the account, then the sequencing of the transactions is critical.

When the Serialize option is specified in an APPLY statement, the Stream operator ensures that operations on a given row occur in the order in which they are received from the input stream.

To use this option, associate a sequencing key (usually the primary index) with the target table. Each input data row is hashed on the key, and the hash determines which session processes that row. As a result, all rows with the same key are processed in sequence by the same session, which is especially important when rows are distributed among many sessions.
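The key-based routing described above can be illustrated with a minimal Python sketch. It is not the Stream operator's implementation: the function names `assign_session` and `route_rows` are hypothetical, and CRC32 is only a stand-in for whatever hash the operator actually uses, which this reference does not describe.

```python
import zlib
from collections import defaultdict

def assign_session(key_values, num_sessions):
    """Map a sequencing key to a session index with a deterministic hash."""
    # Assumption: CRC32 stands in for the operator's real (undocumented here) hash.
    encoded = "|".join(str(v) for v in key_values).encode("utf-8")
    return zlib.crc32(encoded) % num_sessions

def route_rows(rows, key_columns, num_sessions):
    """Group rows by session, keeping arrival order inside each session."""
    sessions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        sessions[assign_session(key, num_sessions)].append(row)
    return sessions

# Hypothetical input rows; empno plays the role of the sequencing key.
rows = [
    {"empno": 101, "dept_name": "Sales"},
    {"empno": 202, "dept_name": "Finance"},
    {"empno": 101, "dept_name": "Marketing"},
]
routed = route_rows(rows, ["empno"], num_sessions=4)
```

Because the hash depends only on the key, both rows for `empno` 101 land in the same session and keep their arrival order there, regardless of how many sessions exist.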

When the Serialize option is used, only one instance of the Stream operator is allowed. Specifying more than one instance causes the Stream operator to terminate with an error.

SERIALIZE OFF

When the Serialize option is set to OFF, transactions are processed in the order they are encountered and placed in the first available buffer. Buffers are sent to parsing engine (PE) sessions, and each PE processes its data independently of the other PEs. In other words, transactions might be applied in any order.
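As a rough illustration of why ordering is not guaranteed with SERIALIZE OFF, the following Python sketch packs rows into buffers in arrival order and hands the buffers to sessions. The buffer size, the round-robin dispatch (a simplification of "first available"), and all function names are assumptions made for the example, not details taken from the Stream operator.

```python
def fill_buffers(rows, buffer_size):
    """Pack rows into fixed-size buffers in arrival order."""
    return [rows[i:i + buffer_size] for i in range(0, len(rows), buffer_size)]

def dispatch_round_robin(buffers, num_sessions):
    """Hand each buffer to the next session (stand-in for 'first available')."""
    sessions = [[] for _ in range(num_sessions)]
    for index, buffer in enumerate(buffers):
        sessions[index % num_sessions].extend(buffer)
    return sessions

def sessions_touching_key(sessions, key_column, key_value):
    """List the indexes of sessions that hold at least one row with the key."""
    return [
        i for i, session_rows in enumerate(sessions)
        if any(r[key_column] == key_value for r in session_rows)
    ]

# Hypothetical account transactions: open the account, then update it.
rows = [
    {"acct": 1, "op": "open"},
    {"acct": 2, "op": "open"},
    {"acct": 1, "op": "update balance"},
    {"acct": 2, "op": "update balance"},
]
sessions = dispatch_round_robin(fill_buffers(rows, buffer_size=2), num_sessions=2)
split = sessions_touching_key(sessions, "acct", 1)
```

In this model the "open" and "update balance" rows for account 1 end up in different sessions. Since the sessions run independently, nothing forces the insert to be applied before the update.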

If the Serialize option is not specified, the default is OFF, unless the job contains an Upsert operation, in which case the default switches to ON.
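The default rule can be restated as a small Python function. This is only a paraphrase of the rule above; the function name and its parameters are hypothetical and do not correspond to any Parallel Transporter API.

```python
def effective_serialize(explicit_setting, job_has_upsert):
    """Return "ON" or "OFF" following the default rule described in the text.

    explicit_setting: "ON", "OFF", or None when the option is not specified.
    job_has_upsert: True when the job contains an Upsert operation.
    """
    if explicit_setting is not None:
        # An explicit SERIALIZE ON or SERIALIZE OFF is used as given.
        return explicit_setting
    # Not specified: OFF by default, ON when the job contains an Upsert.
    return "ON" if job_has_upsert else "OFF"
```

For example, `effective_serialize(None, job_has_upsert=True)` returns `"ON"`, while `effective_serialize(None, job_has_upsert=False)` returns `"OFF"`.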

SERIALIZE ON

If the Serialize option is set to ON, operations on a row occur serially in the order submitted.

The sequencing key of SERIALIZE ON is specified as one or more column names occurring in the input data SCHEMA definition. These SCHEMA columns are collectively referred to as the key. Usually the key is the primary index of the table being updated, but it can be a different column or set of columns. For example:

APPLY 
      ('UPDATE emp SET dept_name = :dept_name
          WHERE empno = :empno;')
          SERIALIZE ON (empno)
TO TARGET_TABLE[1]

This APPLY statement guarantees that all data rows with the same key (empno) are applied to the database in the same order in which they are received from the producer operator. In this case, the column empno is the primary index of the Emp table.

Note that SERIALIZE ON is local to a specific DML statement. In the following example, a group DML is specified, but only the first statement uses the Serialize option:

APPLY
      ('UPDATE emp SET dept_num = :dept_num
          WHERE empno = :empno; ')
          SERIALIZE ON (empno)
      ('UPDATE dept SET dept_name = :dept_name
          WHERE deptno = :deptno; ')
TO TARGET_TABLE[1]

The Serialize option offers the following advantages, which might also improve performance:

  • SERIALIZE ON can eliminate the lock delays or potential deadlocks caused by primary index collisions coming from multiple sessions.
  • SERIALIZE ON can also reduce deadlocks when rows with non-unique primary index values are processed.
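One way to see where the reduction in lock contention comes from is to count how many keys are handled by more than one session. The Python sketch below compares a round-robin model of unserialized dispatch with key-hash routing. Both models, the CRC32 hash, and the name `keys_shared_across_sessions` are assumptions for illustration, not the operator's actual behavior.

```python
import zlib
from collections import defaultdict

def keys_shared_across_sessions(assignments):
    """Return the keys that appear in more than one session.

    assignments: iterable of (session_index, key) pairs.
    """
    sessions_per_key = defaultdict(set)
    for session, key in assignments:
        sessions_per_key[key].add(session)
    return sorted(k for k, s in sessions_per_key.items() if len(s) > 1)

keys = [7, 8, 7, 9, 8, 7]   # hypothetical primary index values in arrival order
num_sessions = 3

# Assumed model of SERIALIZE OFF: rows are spread over sessions round-robin.
unserialized = [(i % num_sessions, k) for i, k in enumerate(keys)]

# Assumed model of SERIALIZE ON: the session is chosen by hashing the key.
serialized = [(zlib.crc32(str(k).encode("utf-8")) % num_sessions, k) for k in keys]

contended_off = keys_shared_across_sessions(unserialized)
contended_on = keys_shared_across_sessions(serialized)
```

In the round-robin model, key 7 is touched by two sessions, which is the situation in which sessions can wait on or deadlock against each other's row-hash locks. With key-hash routing, no key is shared between sessions, so that source of contention does not arise.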