Attributes in the TPT Script That Use the Stream Operator - Parallel Transporter

Teradata Parallel Transporter Reference

Product
Parallel Transporter
Release Number
16.10
Published
July 2017
Language
English (United States)
Last Update
2018-06-28
Product Category
Teradata Tools and Utilities

For the most part, you may choose how to set the attributes in the Stream operator itself. The key variables are the Pack attribute, the MaxSessions attribute, the MinSessions attribute, the ArraySupport attribute (ON or OFF), and the SERIALIZE job option (ON or OFF).

  • The Pack attribute – one of the key performance enablers in the Stream operator, it sets the number of rows that each session’s buffer will hold. Its default value is 20. In general, a higher pack factor yields better performance than a lower one.
  • The MaxSessions and MinSessions attributes – tell the Stream operator the maximum and minimum number of sessions to use to transfer data to the Teradata Database.
  • The ArraySupport attribute – offers a new data-driven iteration capability that provides an improved way for the Stream operator to iterate a parameterized DML statement for multiple sets of parameter values within a single request.
  • The SERIALIZE job option – can be used to ensure the order in which records are applied and/or to reduce row hash lock contention. The Stream operator calculates a hash value based on the key (usually the primary index) to determine the session assigned to process each input row. This allows all rows with the same key to be processed in sequence by the same session, which is especially important if rows are distributed among many sessions.
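The session assignment described for SERIALIZE can be illustrated with a minimal Python sketch. It is not the Stream operator's implementation: the names `assign_session` and `route_rows` are hypothetical, and `zlib.crc32` is only a stand-in for whatever hash the operator actually computes on the key. The sketch shows the property that matters, namely that rows with the same key always land in the same session and keep their input order there.

```python
import zlib
from collections import defaultdict


def assign_session(key: str, session_count: int) -> int:
    """Map a key to a session index.

    Assumption: crc32 stands in for the operator's real hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % session_count


def route_rows(rows, session_count):
    """Group (key, payload) rows by session, preserving input order per session."""
    sessions = defaultdict(list)
    for key, payload in rows:
        sessions[assign_session(key, session_count)].append((key, payload))
    return dict(sessions)


# Three changes to the same key, interleaved with a row for another key.
rows = [
    ("acct-1", "insert"),
    ("acct-2", "insert"),
    ("acct-1", "update"),
    ("acct-1", "delete"),
]
routed = route_rows(rows, session_count=5)

for session, assigned in sorted(routed.items()):
    print(session, assigned)
```

Because every "acct-1" row is handled by one session, its insert, update, and delete are applied in the order they arrived, and no two sessions compete for the same row hash.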

These variables are correlated. Here are some recommendations:

  • A general Stream operator throughput recommendation is “Pack up to maximum and Sessions Up until trouble”. If you experience AWT (AMP worker task), CPU, or hash collision problems, reduce the number of sessions and use SERIALIZE ON. The reduction should not go much beyond 8 or 16 sessions from the current or recommended session count.
  • Set MaxSessions and MinSessions to the same value to help serialization. Starve sessions with a lower Pack factor to slow the row rate. Reduce Pack by 15% for tables with medium-to-large row sizes; reduce Pack by 25% for tables with small row sizes.
  • Records with the same primary index hash to the same session. With fewer sessions, data “clumps” can slow throughput.
  • Increasing Pack with the same session footprint will increase rows-per-second throughput.
  • With SERIALIZE ON, session count needs to be kept as an odd number.
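The arithmetic behind these recommendations can be sketched in a few lines of Python. The helper names (`reduced_pack`, `odd_session_count`, `suggest_settings`) and the row-size labels are hypothetical, and rounding an even session count down rather than up is an assumption made here, not something the recommendations specify.

```python
# Percent reductions taken from the recommendations above.
PACK_REDUCTION_PCT = {"small": 25, "medium-to-large": 15}


def reduced_pack(current_pack: int, row_size: str) -> int:
    """Lower the Pack factor by the recommended percentage (integer math, minimum 1)."""
    pct = PACK_REDUCTION_PCT[row_size]
    return max(1, current_pack * (100 - pct) // 100)


def odd_session_count(requested: int) -> int:
    """Return an odd session count for SERIALIZE ON.

    Assumption: an even request is rounded down, never below 1.
    """
    if requested < 1:
        raise ValueError("session count must be at least 1")
    return requested if requested % 2 == 1 else max(1, requested - 1)


def suggest_settings(current_pack: int, sessions: int, row_size: str) -> dict:
    """Combine the rules: reduced Pack, odd session count, MinSessions == MaxSessions."""
    count = odd_session_count(sessions)
    return {
        "Pack": reduced_pack(current_pack, row_size),
        "MaxSessions": count,
        "MinSessions": count,
    }


print(suggest_settings(40, 16, "small"))
# {'Pack': 30, 'MaxSessions': 15, 'MinSessions': 15}
```

The resulting values are only starting points; the actual limits depend on the system's AWT and CPU headroom.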