Specifying Instances - Parallel Transporter

Teradata Parallel Transporter User Guide

Product: Parallel Transporter
Release Number: 15.00
Language: English (United States)
Last Update: 2018-09-27
dita:id: B035-2445
lifecycle: previous
Product Category: Teradata Tools and Utilities

Specifying Instances

You can specify the number of instances for an operator in the APPLY TO or SELECT FROM statement in which it is referenced, using the form (operator_name [number of instances]), as shown in the following example:

   APPLY <DML>...TO OPERATOR (UPDATE_OPERATOR [2]...) 

When determining the right number of instances for your job, note that producer operators tend to use all of the instances specified in the script, while consumer operators often use fewer instances than the number specified. This difference arises because producers and consumers use instances differently:

  • Producers automatically balance the load across all instances, pumping data into the data stream as fast as they can.
  • By default, consumers use only as many instances as needed. If one instance can read and process the data in the data stream as quickly as the producers can write it, the other instances are not used. If the first instance cannot keep up with the producer operators, the second instance is engaged, and so on.
  • The -C command-line option overrides the default behavior by instructing producer operators and their underlying data streams to ship data blocks to target consumer operators in a cyclical, round-robin manner, providing a more even distribution of data to consumer operators.
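
The difference between the default, demand-driven behavior and the round-robin behavior selected by -C can be illustrated with a toy model. The Python sketch below is not Teradata PT code and makes no claim about how data streams are implemented internally. The function name distribute, its parameters, and the idea of a fixed per-tick capacity for each consumer instance are all assumptions introduced only for illustration.

```python
from typing import List


def distribute(blocks_per_tick: int, ticks: int, instances: int,
               capacity: int, round_robin: bool = False) -> List[int]:
    """Toy model (hypothetical): count data blocks handled per consumer instance.

    blocks_per_tick: blocks the producers write in each time step
    capacity:        blocks one consumer instance can process per time step
    round_robin:     False models the default behavior, True models -C
    """
    handled = [0] * instances
    next_rr = 0  # next target instance when cycling round-robin
    for _ in range(ticks):
        used = [0] * instances  # blocks taken by each instance in this tick
        for _ in range(blocks_per_tick):
            if round_robin:
                target = next_rr
                next_rr = (next_rr + 1) % instances
            else:
                # Default: the lowest-numbered instance that can still keep up.
                # If every instance is saturated, fall back to the least loaded.
                least_loaded = min(range(instances), key=lambda i: used[i])
                target = next(
                    (i for i in range(instances) if used[i] < capacity),
                    least_loaded,
                )
            used[target] += 1
            handled[target] += 1
    return handled


if __name__ == "__main__":
    # Four consumer instances, each able to process two blocks per tick.
    print(distribute(3, 10, 4, 2))                    # [20, 10, 0, 0]
    print(distribute(3, 10, 4, 2, round_robin=True))  # [8, 8, 7, 7]
```

In this model, with three blocks arriving per tick and a capacity of two blocks per instance, the default policy engages only the first two of the four instances, while the round-robin policy spreads the same 30 blocks across all four.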

Consider the following when specifying operator instances:

  • If the number of instances is not specified, the default is 1 instance per operator.
  • Experiment. Start by specifying only one or two instances for any given operator.
  • Teradata PT will start as many instances as specified, but it uses only as many as needed.
  • Don't create more instances than needed, because each instance consumes system resources.
  • Read the Teradata PT log file, which displays statistics showing how much data was processed by each instance. Reduce the number of instances if you see underutilized instances of any operator. If all instances are used, add more and see whether the job runs better.
  • If the number of instances exceeds the number of available sessions, the job aborts. Therefore, when specifying multiple instances, make sure the MaxSessions attribute is set high enough that there is at least one session per instance.
  • After the job runs, use the evaluation criteria shown in “Strategies for Balancing Sessions and Instances” to help adjust and optimize the number of operator instances.
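
The session check and the tuning loop described in the list above can be sketched as two small helpers. This is a hedged illustration in Python, not part of Teradata PT: the function names, the idle_share threshold of 5 percent, and the assumption that you have copied the per-instance row counts out of the job log by hand are all hypothetical. The sketch does not parse the log, because the log layout is not described here.

```python
from typing import Sequence


def sessions_cover_instances(instances: int, max_sessions: int) -> bool:
    """True if there is at least one session per operator instance."""
    return max_sessions >= instances


def suggest_instance_count(rows_per_instance: Sequence[int],
                           idle_share: float = 0.05) -> int:
    """Suggest an instance count to try on the next run.

    rows_per_instance: rows processed by each instance, taken from the job log
    idle_share:        assumed cutoff; an instance that handled this share of
                       the total rows or less is treated as underutilized
    """
    current = len(rows_per_instance)
    total = sum(rows_per_instance)
    if current == 0 or total == 0:
        return max(1, current)  # no statistics to learn from
    busy = sum(1 for rows in rows_per_instance if rows / total > idle_share)
    if busy < current:
        return max(1, busy)  # drop the underutilized instances
    return current + 1       # all instances were used: try one more


if __name__ == "__main__":
    print(sessions_cover_instances(instances=4, max_sessions=2))  # False
    print(suggest_instance_count([5000, 4800, 0, 0]))             # 2
    print(suggest_instance_count([3000, 3100, 2900]))             # 4
```

The suggestion is only a starting point for the next experiment. Whether an added instance actually helps still has to be confirmed by rerunning the job and comparing the log statistics.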