When a Teradata PT job logs a checkpoint, the producer operator in the currently executing job step stops putting rows into the output data stream, and the consumer operator processes all the rows remaining in the input data stream. All executing operators then write records to the job checkpoint files. These records contain the information the operators need to resume processing from the point at which the checkpoint was completed, with no loss or duplication of data.
Teradata PT automatically creates a start-of-data and an end-of-data checkpoint. In addition, you can use the -z option of the tbuild command to specify a user-defined checkpoint interval (in seconds).
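The checkpoint sequence described above can be illustrated with a small simulation. This is only a conceptual sketch, not Teradata PT code: the Producer and Consumer classes, take_checkpoint, restart, and run_demo are hypothetical names invented for the example, and the data stream is modeled as an in-memory queue. The restart step also assumes that rows loaded after the checkpoint can simply be discarded, which real operators do not necessarily do.

```python
from collections import deque


class Producer:
    """Hypothetical producer: reads source rows and puts them into the stream."""

    def __init__(self, source, position=0):
        self.source = source
        self.position = position  # index of the next source row to send

    def put_rows(self, stream, count):
        for _ in range(count):
            if self.position >= len(self.source):
                break
            stream.append(self.source[self.position])
            self.position += 1


class Consumer:
    """Hypothetical consumer: takes rows from the stream and loads them."""

    def __init__(self, loaded=None):
        self.loaded = list(loaded or [])

    def process(self, stream, count=None):
        # Process up to `count` rows, or every row in the stream if count is None.
        while stream and (count is None or count > 0):
            self.loaded.append(stream.popleft())
            if count is not None:
                count -= 1


def take_checkpoint(producer, consumer, stream):
    # The producer is not called here, which models it pausing its output.
    consumer.process(stream)  # the consumer drains the data stream
    # Both operators record what they would need in order to resume.
    return {"producer_position": producer.position,
            "rows_loaded": len(consumer.loaded)}


def restart(source, target_rows, checkpoint):
    # Simplifying assumption: rows loaded after the checkpoint are discarded.
    producer = Producer(source, checkpoint["producer_position"])
    consumer = Consumer(target_rows[:checkpoint["rows_loaded"]])
    return producer, consumer


def run_demo():
    source = list(range(10))
    stream = deque()
    producer, consumer = Producer(source), Consumer()

    producer.put_rows(stream, 4)
    consumer.process(stream, 2)   # two rows are still in the stream
    checkpoint = take_checkpoint(producer, consumer, stream)

    producer.put_rows(stream, 3)
    consumer.process(stream, 1)   # the job "fails" here, mid-stream

    # Restart from the checkpoint and run the job to completion.
    producer, consumer = restart(source, consumer.loaded, checkpoint)
    stream = deque()
    producer.put_rows(stream, len(source))
    consumer.process(stream)
    return checkpoint, consumer.loaded
```

In this sketch, run_demo() takes the checkpoint after four rows have been sent and loaded, fails after three more rows have been sent, and then restarts from the recorded positions. The final target holds each of the ten source rows exactly once.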
Handling Data Processed After the Checkpoint
If rows are already in the data streams or have already been loaded when a job fails, restarting the job could cause the same rows to be sent again. Here is how the operators handle duplicate rows on restart: