Effects of Interval Checkpointing on Job Performance - Parallel Transporter

Teradata® Parallel Transporter User Guide

Product: Parallel Transporter
Release Number: 17.00
Published: August 31, 2020
Language: English (United States)
Last Update: 2020-08-27
dita:id: B035-2445
lifecycle: previous
Product Category: Teradata Tools and Utilities

Checkpoints increase Teradata PT job overhead. In terms of resources, each executing operator must do the additional work of writing its internal operating state to the checkpoint file so that it can be restarted from that information. In terms of running time, each executing operator must first finish all in-progress work, take its checkpoint, and then wait (when necessary) until all the other operators have finished taking their checkpoints.
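The running-time cost described above behaves like a barrier: no operator resumes until the slowest one has finished its checkpoint. The following Python sketch illustrates that effect only. It is not Teradata PT code; the function name checkpoint_stall and the per-operator timings are hypothetical values chosen for illustration.

```python
def checkpoint_stall(drain_secs, write_secs):
    """Model one checkpoint as a barrier across operators.

    drain_secs[i]: time operator i needs to finish its in-progress work.
    write_secs[i]: time operator i needs to write its state to the checkpoint file.
    Returns (duration of the whole checkpoint, idle wait per operator).
    """
    if len(drain_secs) != len(write_secs):
        raise ValueError("one drain time and one write time per operator")
    # Each operator is ready once it has drained its work and written its state.
    ready = [d + w for d, w in zip(drain_secs, write_secs)]
    # The checkpoint ends only when the slowest operator is ready.
    total = max(ready)
    # Every faster operator sits idle for the difference.
    waits = [total - r for r in ready]
    return total, waits


# Hypothetical timings (seconds) for four operator instances.
total, waits = checkpoint_stall([0.25, 0.5, 0.25, 1.0], [0.25, 0.25, 0.5, 0.5])
print("checkpoint duration:", total)
print("idle wait per operator:", waits)
```

In this model the job pays the slowest operator's time at every checkpoint, which is why the running-time cost can exceed the work any single operator does to save its state.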

Frequent checkpoints limit the amount of work that must be repeated if the job is interrupted and later restarted, because they shorten the time between the most recent checkpoint and the error event. However, specifying a very short checkpoint interval can significantly increase job running time. Choosing a checkpoint interval is a trade-off between the cost of increased job running time and the potential reduction in repeated work if the job must be restarted.
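One way to reason about this trade-off is a rough expected-cost comparison. The Python sketch below is a simplified model, not a Teradata formula. It assumes at most one failure per run, a failure equally likely at any point, repeated work equal to the time elapsed since the last checkpoint, and no fixed restart cost. The function name and all input values (baseline time, overhead fraction, failure probability) are hypothetical.

```python
def expected_run_secs(baseline_secs, overhead_fraction, failure_probability,
                      interval_secs=None):
    """Expected running time under the simple one-failure model.

    interval_secs=None models a job with no interval checkpointing, where a
    failure repeats all work done so far (on average half of the run).
    """
    run_secs = baseline_secs * (1.0 + overhead_fraction)
    if interval_secs is None:
        average_rework = run_secs / 2.0
    else:
        # On average the failure lands halfway between two checkpoints,
        # and never more than one full run's worth of work is lost.
        average_rework = min(interval_secs, run_secs) / 2.0
    return run_secs + failure_probability * average_rework


# Hypothetical one-hour job with a 25% chance of one interruption.
no_checkpoints = expected_run_secs(3600, 0.0, 0.25)
# Hypothetical 1% overhead for a 10-minute checkpoint interval.
with_checkpoints = expected_run_secs(3600, 0.01, 0.25, interval_secs=600)
print(f"no interval checkpoints: {no_checkpoints:.0f} s expected")
print(f"600 s interval:          {with_checkpoints:.0f} s expected")
```

Under these assumptions, a shorter interval pays off only while the reduction in expected repeated work is larger than the added overhead, so the overhead fraction for a given interval should come from measurements of the actual job.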

Here is an example of a Teradata PT job that loads 20,000,000 rows with 4 instances each of the producer and consumer operators:

  • Specifying a checkpoint interval of 10 seconds increased the job's running time by 7.3% and its host CPU time by 3.3%.
  • Specifying a checkpoint interval of 5 seconds increased the job's running time by 20% and its host CPU time by 6.6%.
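The arithmetic behind these two measurements can be made explicit. The Python sketch below uses only the percentages quoted above; the 1,000-second baseline is a hypothetical figure, since the example does not state the job's running time without interval checkpoints.

```python
# Measured increases from the example above, keyed by checkpoint interval (seconds).
MEASURED_INCREASE_PCT = {
    10: {"run_time": 7.3, "cpu_time": 3.3},
    5: {"run_time": 20.0, "cpu_time": 6.6},
}


def overhead_growth(metric, long_interval=10, short_interval=5):
    """Factor by which the overhead grows when the interval is shortened."""
    return (MEASURED_INCREASE_PCT[short_interval][metric]
            / MEASURED_INCREASE_PCT[long_interval][metric])


def projected_secs(baseline_secs, interval, metric="run_time"):
    """Apply a measured percentage increase to a hypothetical baseline."""
    return baseline_secs * (1 + MEASURED_INCREASE_PCT[interval][metric] / 100)


print(f"CPU overhead growth:      {overhead_growth('cpu_time'):.2f}x")
print(f"Run-time overhead growth: {overhead_growth('run_time'):.2f}x")

# Hypothetical baseline of 1,000 seconds without interval checkpoints.
for interval in (10, 5):
    print(f"{interval:>2} s interval: {projected_secs(1000, interval):.0f} s")
```

Halving the interval doubles the host CPU overhead (3.3% to 6.6%) but raises the running-time overhead by a factor of about 2.7 (7.3% to 20%), so in this example the running-time cost grows faster than the checkpoint frequency.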

Even though interval checkpointing can have a substantial performance cost, its usefulness during a possible restart makes it a Teradata “best practice” recommendation.