16.20 - Review the Metadata - Parallel Transporter

Teradata® Parallel Transporter User Guide

Product: Parallel Transporter
Release Number: 16.20
Published: August 2020
Language: English (United States)
Last Update: 2020-08-27

Teradata PT provides two types of metadata.

  • The TWB_STATUS private log captures job performance metadata.
  • The TWB_SRCTGT private log captures source and target metadata.

TWB_STATUS

The TWB_STATUS private log captures job performance data at different stages of the job. Teradata PT also provides a tbuild command option for specifying the interval (in seconds) at which performance data is collected. For information about all tbuild options, see Teradata Parallel Transporter Reference (B035-2436).

This information is useful for evaluating the performance of a job in terms of throughput and the cost of exporting and loading data for each operator. It is also useful for capacity planning: collect the performance data over a period of time, summarize the CPU utilization and elapsed time for each job, and then determine the performance trend of the overall loading and exporting processes for a specific system configuration.
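The capacity-planning idea above can be sketched in a few lines of Python. This is only an illustration: the job names, field names, and figures are hypothetical and do not reflect the actual TWB_STATUS layout. It assumes the CPU and elapsed seconds for each run have already been summarized by hand or by another tool.

```python
from statistics import mean

# Hypothetical per-job summaries collected over several runs.
# The field names are illustrative, not actual TWB_STATUS columns.
job_history = [
    {"job": "load_day1", "cpu_seconds": 120.0, "elapsed_seconds": 300.0},
    {"job": "load_day2", "cpu_seconds": 130.0, "elapsed_seconds": 330.0},
    {"job": "load_day3", "cpu_seconds": 140.0, "elapsed_seconds": 360.0},
]


def linear_trend(values):
    """Return the least-squares slope of the values against their run index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


cpu_trend = linear_trend([j["cpu_seconds"] for j in job_history])
elapsed_trend = linear_trend([j["elapsed_seconds"] for j in job_history])

print(f"CPU seconds change per run: {cpu_trend:+.1f}")
print(f"Elapsed seconds change per run: {elapsed_trend:+.1f}")
```

A steadily positive slope on a fixed system configuration suggests that the load is growing faster than the configuration can absorb.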

Action

Here are some tips for performance evaluations and tuning:

  • Determine the difference in CPU utilization between the producer and consumer operators. For example, if the CPU utilization of the producer operator is twice that of the consumer operator, doubling the number of producer instances might improve the throughput of the job.
  • Determine the difference between the CPU utilization and the elapsed time for exporting and loading the data (that is, the EXECUTE method). If the elapsed time is much higher than the CPU time, a bottleneck might have occurred on the network, in the I/O system, or on the Teradata Database server.
  • Find out how many rows were sent by the producer operator (or received by the consumer operator) with the above CPU utilization. Dividing the number of rows by the CPU seconds spent processing them gives the number of rows per CPU second.
  • The difference between the “start time” of two successive methods indicates how long the job spent on the first of those methods.
  • Find out how much time is being spent on each checkpoint. Note that each checkpoint takes time and resources to process. If checkpoints account for too much of the job, tune the number of checkpoints taken by changing the checkpoint interval.
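The arithmetic behind these tips can be sketched as follows. This is a minimal illustration, not a parser for the TWB_STATUS log: all figures are hypothetical values assumed to have been read from the log by hand, the dictionary keys are invented for this example, and the method names other than EXECUTE are placeholders.

```python
from datetime import datetime

# Hypothetical figures for one job; the keys are illustrative only.
producer = {"cpu_seconds": 40.0, "rows": 1_000_000}
consumer = {"cpu_seconds": 20.0, "rows": 1_000_000}
execute_elapsed_seconds = 300.0


def cpu_ratio(producer_cpu, consumer_cpu):
    """How many times more CPU the producer used than the consumer."""
    return producer_cpu / consumer_cpu


def rows_per_cpu_second(rows, cpu_seconds):
    """Throughput expressed as rows processed per CPU second."""
    return rows / cpu_seconds


def elapsed_to_cpu(elapsed_seconds, cpu_seconds):
    """A large value hints at waiting on network, I/O, or the database."""
    return elapsed_seconds / cpu_seconds


def method_durations(starts):
    """Seconds spent in each method, from ordered (method, start_time) pairs.

    The last method has no successor, so it is not included in the result.
    """
    return {
        name: (next_start - start).total_seconds()
        for (name, start), (_, next_start) in zip(starts, starts[1:])
    }


# Placeholder method names and start times.
starts = [
    ("INITIATE", datetime(2020, 8, 27, 10, 0, 0)),
    ("EXECUTE", datetime(2020, 8, 27, 10, 0, 5)),
    ("TERMINATE", datetime(2020, 8, 27, 10, 5, 5)),
]

print(cpu_ratio(producer["cpu_seconds"], consumer["cpu_seconds"]))
print(rows_per_cpu_second(producer["rows"], producer["cpu_seconds"]))
print(elapsed_to_cpu(execute_elapsed_seconds, producer["cpu_seconds"]))
print(method_durations(starts))
```

With these sample figures the producer uses twice the CPU of the consumer, which is the situation in which the first tip suggests trying more producer instances.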

TWB_SRCTGT

The source and target data shown in this log is for reference only, and requires no specific usage strategy.