Effects of Interval Checkpointing on Job Performance
Checkpoints increase Teradata PT job overhead. In terms of resources, each executing operator must do the additional work of writing its internal operating state to the checkpoint file, so that it can be restarted from the information in that file. In terms of running time, each executing operator must first finish all in-progress work, take its checkpoint, and then wait (when necessary) until all the other operators have finished taking their checkpoints.
Frequent checkpoints ensure that only a limited amount of work has to be repeated if the job is interrupted and later restarted, because they shorten the time between the most recent checkpoint and an error event. However, specifying a very short checkpoint interval can significantly increase job running time. Choosing a checkpoint interval is a trade-off between the cost of increased job running time and the potential reduction in repeated work if the job must be restarted.
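The trade-off described above can be illustrated with a rough back-of-the-envelope model. The following Python sketch is not part of Teradata PT and does not reflect measured behavior. It assumes a fixed, hypothetical cost per checkpoint and a failure that is equally likely at any point in the run. The function name, parameters, and all input values are illustrative assumptions.

```python
def checkpoint_tradeoff(base_runtime_s, checkpoint_cost_s, interval_s):
    """Estimate (total_runtime_s, expected_rework_s) under a simplified model.

    Assumptions (hypothetical, not Teradata PT internals):
      - every checkpoint adds the same fixed cost to the job run time
      - a failure is equally likely at any moment between two checkpoints
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")

    # Approximate number of checkpoints taken over the uninterrupted run.
    n_checkpoints = int(base_runtime_s // interval_s)

    # Each checkpoint adds its cost to the overall running time.
    total_runtime_s = base_runtime_s + n_checkpoints * checkpoint_cost_s

    # On average, a failure lands halfway through an interval, so about
    # half an interval of work must be repeated after a restart.
    expected_rework_s = min(interval_s, base_runtime_s) / 2

    return total_runtime_s, expected_rework_s


if __name__ == "__main__":
    # Hypothetical job: 600 s without checkpoints, 5 s per checkpoint.
    for interval in (30, 60, 300):
        total, rework = checkpoint_tradeoff(600, 5, interval)
        print(f"interval={interval}s total={total}s expected_rework={rework}s")
```

In this model, shortening the interval lowers the expected rework but raises the total running time, which is the same tension the paragraph above describes. Real checkpoint costs depend on the operators, the number of instances, and the data volume, so values like these should be measured rather than assumed.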
Here is an example of a Teradata PT job that loads 20,000,000 rows with 4 instances each of the producer and consumer operators:
Even though interval checkpointing may have a substantial performance cost, its usefulness during a possible restart makes it a Teradata “best practice” recommendation.