Detecting Unresponsive Jobs

Teradata® Data Mover User Guide

Product: Teradata Data Mover
Release Number: 17.11
Release Date: October 2021
Content Type: User Guide
Publication ID: B035-4101-091K
Language: English (United States)

Data Mover can be configured to detect and stop jobs that become unresponsive after user-defined timeout periods. Configure this feature by using the Job Timeout tab of the Data Mover Setup portlet or in the configuration.xml file generated when the list_configuration command is run.

Variable Timeout Periods

A Data Mover job includes several phases, and the duration of some phases depends heavily on the size of the data being copied. The larger the data, the longer the time required for the initiate phase, the apply rows phase (for Teradata PT API tasks), or the build phase (for DSA tasks). You can use the Data Mover Setup portlet or configuration.xml to define the size categories and to specify a timeout period for each combination of size category and phase.

Data Mover performs the following process to determine the actual timeout period to use before stopping an unresponsive job:
  1. Detects the size of the object a job is moving to determine whether the object is small, medium, or large.
  2. Checks the current phase of the job and uses the timeout period configured for that phase and size category to determine whether the job has exceeded it.
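
The two steps above can be sketched in Python as follows. This is an illustrative sketch only, not Data Mover code: the function names are hypothetical, the values are the defaults from the configuration properties table in this section, and the boundary handling (a size equal to a limit falls inside that category) and the conversion of 1 GB to 1024 MB are assumptions.

```python
# Illustrative sketch only; function names are hypothetical, not a Data Mover API.
# Values are the documented defaults from the configuration properties table.
SMALL_MAX_MB = 5    # hanging.job.timeout.range.small.max (in MB)
LARGE_MIN_GB = 10   # hanging.job.timeout.range.large.min (in GB)
MB_PER_GB = 1024    # assumption: binary units

# Default timeout periods in hours, per size category and phase.
TIMEOUT_HOURS = {
    "small": {"initiate": 2, "apply": 2, "build": 2},
    "medium": {"initiate": 4, "apply": 4, "build": 4},
    "large": {"initiate": 8, "apply": 8, "build": 8},
}
ACQUISITION_TIMEOUT_HOURS = 1  # hanging.job.timeout.acquisition


def size_category(size_mb: float) -> str:
    """Step 1: classify the object as small, medium, or large."""
    if size_mb <= SMALL_MAX_MB:                # assumption: limit is inclusive
        return "small"
    if size_mb >= LARGE_MIN_GB * MB_PER_GB:    # assumption: limit is inclusive
        return "large"
    return "medium"


def timeout_hours(size_mb: float, phase: str) -> int:
    """Step 2a: look up the timeout period for the current phase."""
    if phase == "acquisition":
        # The acquisition timeout is a single value, not tied to a size category.
        return ACQUISITION_TIMEOUT_HOURS
    return TIMEOUT_HOURS[size_category(size_mb)][phase]


def is_unresponsive(size_mb: float, phase: str, hours_in_phase: float) -> bool:
    """Step 2b: check whether the job has run longer than the timeout period."""
    return hours_in_phase > timeout_hours(size_mb, phase)
```

For example, under these assumptions a 500 MB object is classified as medium, so a job that has spent 5 hours in the build phase exceeds the 4-hour default and would be treated as unresponsive.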

Configuration Properties for Unresponsive Jobs

You can set the configuration properties listed below to enable detection of unresponsive jobs and to set the timeout periods after which these jobs are stopped.

| Property | Description | Default Value |
|---|---|---|
| hanging.job.check.enabled | Enable or disable detection of unresponsive jobs. If this property is not set to true, none of the defaults for the other properties listed in this table are applicable. | false |
| hanging.job.check.rate | Frequency with which Data Mover checks for unresponsive jobs (in hours). | 1 |
| hanging.job.timeout.acquisition | Timeout for the acquisition phase of tasks (in hours). | 1 |
| hanging.job.timeout.range.small.max | Defines maximum size (in MB) for an object to be considered a small object. | 5 |
| hanging.job.timeout.range.large.min | Defines minimum size (in GB) for an object to be considered a large object. | 10 |
| hanging.job.timeout.small.apply | Small objects: Timeout period for application phase (in hours). | 2 |
| hanging.job.timeout.small.build | Small objects: Timeout period for build phase (in hours). | 2 |
| hanging.job.timeout.small.initiate | Small objects: Timeout period for initiate phase (in hours). | 2 |
| hanging.job.timeout.medium.apply | Medium objects: Timeout period for application phase (in hours). | 4 |
| hanging.job.timeout.medium.build | Medium objects: Timeout period for build phase (in hours). | 4 |
| hanging.job.timeout.medium.initiate | Medium objects: Timeout period for initiate phase (in hours). | 4 |
| hanging.job.timeout.large.apply | Large objects: Timeout period for application phase (in hours). | 8 |
| hanging.job.timeout.large.build | Large objects: Timeout period for build phase (in hours). | 8 |
| hanging.job.timeout.large.initiate | Large objects: Timeout period for initiate phase (in hours). | 8 |
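
The rule that the defaults apply only when hanging.job.check.enabled is set to true can be sketched in Python as follows. This is a hypothetical illustration of how the documented defaults combine with user-supplied values, not a description of how Data Mover reads configuration.xml. The function name, the plain dictionary of string values, and the empty result when detection is disabled are all assumptions.

```python
# Illustrative sketch only; effective_settings is a hypothetical helper,
# not part of Data Mover. Defaults are the documented values from the table.
DEFAULTS = {
    "hanging.job.check.rate": 1,               # hours
    "hanging.job.timeout.acquisition": 1,      # hours
    "hanging.job.timeout.range.small.max": 5,  # MB
    "hanging.job.timeout.range.large.min": 10, # GB
}
# Size-dependent timeouts (hours): small = 2, medium = 4, large = 8.
for category, hours in (("small", 2), ("medium", 4), ("large", 8)):
    for phase in ("apply", "build", "initiate"):
        DEFAULTS[f"hanging.job.timeout.{category}.{phase}"] = hours


def effective_settings(overrides: dict) -> dict:
    """Return the settings in effect, or an empty dict if detection is off."""
    enabled = str(overrides.get("hanging.job.check.enabled", "false")).lower()
    if enabled != "true":
        # Detection is disabled, so none of the other defaults apply.
        return {}
    settings = dict(DEFAULTS)
    for key, value in overrides.items():
        if key in DEFAULTS:
            settings[key] = int(value)  # assumption: values are whole numbers
    return settings
```

Under these assumptions, calling effective_settings with only hanging.job.check.enabled set to "true" yields every default in the table, while omitting that property yields no timeouts at all.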