Working with Workload Exceptions - Teradata Workload Analyzer

Teradata Workload Analyzer User Guide

Product: Teradata Workload Analyzer
Release Number: 15.00
Language: English (United States)
Last Update: 2018-09-27
dita:id: B035-2514
lifecycle: previous
Product Category: Teradata Tools and Utilities

Working with Workload Exceptions

Setting Run-Time Exception Directives

After you create new WDs, you define exception directives that instruct Teradata WA how to monitor queries, and what to do if a query exceeds exception criteria while it is executing.

An exception directive consists of a set of exception criteria (exception metrics) and a set of exception actions (actions that Teradata WA takes when all of the metrics for a set of exception criteria are exceeded). If a request exceeds all exception metrics (for example, a request exceeds 1000 CPU seconds), it is potentially disqualified from the workload and is subject to the enabled exception actions.

When you specify multiple metrics in one exception criteria set for a WD, the metrics act as a set of AND'ed conditions.

Multiple exception directives can be defined for a WD. When you define multiple exception directives for a WD with different sets of exception criteria but the same set of exception actions, each exception criteria set is treated as alternative (OR'd) conditions. See “Handling Concurrent Multiple Exception Directives for a WD” on page 155 for guidelines Teradata WA follows when multiple exception directives are applicable at the same time for a WD.
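The AND/OR behavior described above can be illustrated with a minimal sketch. This is not Teradata WA code: the Directive class, the metric names (cpu_seconds, io_count, spool_bytes), and the sample limits are hypothetical and exist only to show how metrics inside one criteria set combine with AND while separate directives act as alternatives.

```python
from dataclasses import dataclass


@dataclass
class Directive:
    """Hypothetical stand-in for one exception directive."""
    name: str
    thresholds: dict  # metric name -> limit; every limit must be exceeded (AND)


def directive_exceeded(directive, observed):
    """True only if every metric in the criteria set is over its limit."""
    return all(observed.get(metric, 0) > limit
               for metric, limit in directive.thresholds.items())


def triggered(directives, observed):
    """Names of the directives whose criteria are exceeded (any one is enough)."""
    return [d.name for d in directives if directive_exceeded(d, observed)]


directives = [
    Directive("high_cpu_and_io", {"cpu_seconds": 1000, "io_count": 500_000}),
    Directive("huge_spool", {"spool_bytes": 10**9}),
]
observed = {"cpu_seconds": 1200, "io_count": 100_000, "spool_bytes": 2 * 10**9}
print(triggered(directives, observed))
```

In this example the first directive does not fire because only one of its two metrics is exceeded, while the second directive fires on its own.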

It is recommended that you avoid overly complex combinations of exception criteria in an exception directive until you have experience with how your system performs.

Exception directives are operating environment-dependent. You can vary exception directives for different operating environments. In one operating environment, a WD may use several exception directives, while in another operating environment the same WD may not use any exception directives.

Note: You cannot specify exception directives for WDs that use utilities as classification criteria.

Setting Local Exception Directives

You can use Teradata WA to define local exception directives for a WD. A local exception directive applies only to the current WD. A global exception directive applies to several (or all) WDs. For instructions on creating global exception directives, see Teradata Viewpoint User Guide (B035‑2206).

Note: Exception actions are unavailable if you selected a utility type in the Classification tab.

To create a new local exception directive for a WD

1 Select a WD and click the Exception tab. The Exception tab appears.

2 Click New. The Add Local Exception dialog box displays.

Figure 1: Add Local Exception Dialog Box

3 Fill in the fields/controls as follows.

 

Table 3: Add Local Exception Fields/Controls

Field/Control

Action/Comment

Exception Name

Enter the name of the new local exception.

Description

Specify an optional description.

Apply to Operating Environment(s)

Apply the local exception to one or more operating environment periods. By default, all operating environments are selected.

Note that you can select Overview in the Exception tab to view the operating environment periods you selected for a local exception.

4 Select OK to close the dialog box and return to the Exception tab.

5 To specify exception criteria for the new exception directive, see “To define exception criteria” on page 149.

6 To specify exception actions for the new exception directive, see “To define exception actions” on page 152.

7 To apply operating environments to your new exception directive, select Apply OpEnv. The Exception Apply dialog box displays with the defined operating environments. For instructions on defining operating environments, see “Adding Initial Workload Periods for PSA Migration” on page 11.

Figure 2: Sample Exception Apply Dialog Box

8 Select the operating environments you want to apply to the exception directive, and select OK to close the dialog box.

9 Select Accept to save your settings, or Restore to reverse them.

Select Overview to view the operating environments you applied to the exception directive.

To delete exceptions

1 From the Exception tab, select the exception you want to delete, and then click Delete. The Exception Delete dialog box appears showing the selected exception.

2 Click OK.

To define exception criteria

1 Select the exception directive you want to define in the Exception tab.

2 Fill in the fields/controls as follows.

 

Table 4: Exception Criteria Fields/Controls 

Field/Control

Action/Comment

Maximum Rows

1 Select the control.

2 Enter the per step maximum rows in a spool file.

IO Count

1 Select the control.

2 Enter the maximum number of disk I/Os performed on behalf of the query.

Spool Size

1 Select the control.

2 Enter the maximum size of a spool file (per step).

3 Choose whether the size is in:

  • Bytes
  • Thousand Bytes
  • Million Bytes
  • Billion Bytes

    Number of AMPs

    1 Select the control.

    2 Enter the number of AMPs that participate in the query.

    Blocked Time

    1 Select the control.

    2 Enter the length of time the query is blocked by another query.

    Elapsed Time

    1 Select the control.

    2 Enter the length of time the query has been running (that is, response time).

    This time is stored in the Teradata Database as centiseconds.

    Sum Over All Nodes

    1 Select the control.

    2 Enter the total amount of CPU time consumed by the query over all nodes.

    Tactical CPU Usage Threshold (per node)

    1 Select the control.

    2 Enter a positive value. Note the following:

  • Specify a value less than the value specified for Sum Over All Nodes.
  • Specify a value less than 3 seconds to optimize performance.
  • Specify Change Workload and another WD mapped to an AG in the same RP as one of the exception actions.
  • For more information on setting this control, see Teradata Viewpoint User Guide (B035‑2206).

    Note the following:

  • This parameter is enabled only if the WD is Tactical in the Workload Attributes tab and there is a positive value for CPU Sum Over All Nodes.
  • All exceptions in the same operating environment with Tactical CPU Usage Threshold (per node) must use the same value and have the same WD in a Change Workload exception action.

    Qualification CPU Time

    1 Select the control. The Qualification Time box becomes active when a skew control is checked.

    2 Enter the length of time, in CPU seconds, that one of the following exception conditions must persist before it qualifies as an exception:

  • CPU millisec per IO
  • IO Skew
  • CPU Skew
  • IO Skew Percent
  • CPU Skew Percent

    You must select one of these criteria to activate Qualification Time. Note that if you do not enter a Qualification Time value, Teradata WA uses the Exception Interval.

    Qualification Time must be an integer multiple of the global Exception Interval and greater than zero. For more information on these options, see Teradata Viewpoint User Guide (B035‑2206).
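The Qualification Time rule above can be expressed as a small check. The following is a sketch only, assuming both values are whole numbers in the same unit; the function name effective_qualification_time and its parameters are hypothetical and are not part of Teradata WA.

```python
def effective_qualification_time(qualification_time, exception_interval):
    """Return the Qualification Time to use, or raise ValueError if invalid.

    Both arguments are assumed to be integers in the same unit.
    Pass None for qualification_time when no value was entered.
    """
    if exception_interval <= 0:
        raise ValueError("Exception Interval must be greater than zero")
    if qualification_time is None:
        # No value entered: fall back to the Exception Interval.
        return exception_interval
    if qualification_time <= 0:
        raise ValueError("Qualification Time must be greater than zero")
    if qualification_time % exception_interval != 0:
        raise ValueError(
            "Qualification Time must be an integer multiple of the Exception Interval"
        )
    return qualification_time


print(effective_qualification_time(180, 60))   # accepted as entered
print(effective_qualification_time(None, 60))  # falls back to the interval
```

With an Exception Interval of 60, a value of 180 is accepted, while a value of 90 is rejected because it is not an integer multiple of the interval.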

    IO Skew

    CPU Skew

    1 Select the control.

    2 Enter a value:

  • IO Skew: The maximum difference in disk I/O counts between the busiest AMP and the average of all involved AMPs during the last exception interval.
  • CPU Skew: The maximum difference in CPU consumption between the busiest AMP and the average of all involved AMPs during the last exception interval.

    A value of zero means there is no skew. A value greater than zero indicates skew that accumulates to a larger and larger value as long as the skew continues, up until the skew exceeds the Qualification Time.

    For more information on these options, see Teradata Viewpoint User Guide (B035‑2206).

    Note the following:

  • The skew must exceed the Qualification Time before Teradata WA performs the action you specify.
  • The skew must persist for a specifiable length of time, in CPU seconds, that is greater than one global Exception Interval, to qualify as an exception.

    CPU millisec per IO

    1 Select the control.

    2 Enter the maximum ratio of CPU consumption to disk I/O during the last exception interval.

    You can use this control to detect queries that have an unusually high ratio of CPU processing relative to logical I/Os incurred (for example, an accidental unconstrained product join performed on a very large table). Because of their very high CPU usage, these queries can steal CPU resources from other higher priority workloads, impacting the ability of the Priority Scheduler to favor higher priority requests.

    It is recommended that you initially set this control to 5 or greater.

    You must select this control to activate Qualification Time.

    Note the following:

  • The ratio must exceed the Qualification Time before Teradata WA performs the action you specify.
  • The ratio must persist for a specifiable length of time, in CPU seconds, that is greater than one global Exception Interval, to qualify as an exception.

    IO Skew Percent

    CPU Skew Percent

    1 Select the control.

    2 Enter a value:

  • IO Skew Percent: The maximum percentage difference in disk I/O counts between the busiest AMP and the average of all involved AMPs during the last exception interval.
  • CPU Skew Percent: The maximum percentage difference in CPU consumption between the busiest AMP and the average of all involved AMPs during the last exception interval.

    A value of 0% means there is no skew. A value greater than 0% indicates skew. The larger the percentage, the worse the skew is. The impact of that skew grows exponentially.

    The skew must exceed the Qualification Time before Teradata WA performs the action you specify.

    For more information about these options, see Teradata Viewpoint User Guide (B035‑2206).
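To make the skew criteria in this table more concrete, the sketch below computes a skew difference and a skew percentage from per-AMP measurements for one exception interval. It is an illustration, not the product's implementation. In particular, expressing the percentage relative to the busiest AMP is an assumption, because this guide states only that it is a percentage difference between the busiest AMP and the average. The function names are hypothetical.

```python
def skew(per_amp):
    """Difference between the busiest AMP and the average of all involved AMPs."""
    average = sum(per_amp) / len(per_amp)
    return max(per_amp) - average


def skew_percent(per_amp):
    """Skew as a percentage of the busiest AMP's value (assumed denominator)."""
    busiest = max(per_amp)
    if busiest == 0:
        return 0.0  # no work recorded, so no skew
    average = sum(per_amp) / len(per_amp)
    return (busiest - average) * 100.0 / busiest


# Hypothetical CPU seconds per AMP during one exception interval.
cpu_per_amp = [10, 10, 10, 50]
print(skew(cpu_per_amp), skew_percent(cpu_per_amp))
```

When every AMP does the same amount of work, both functions return zero, which matches the "no skew" case described in the table.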

    3 Select Accept to accept your changes.

    4 Next, specify the exception actions Teradata WA performs when the exception criteria are exceeded.

    To define exception actions

    ✔ Under Exception Actions, select the appropriate fields/controls. You must specify at least one exception criteria to access these controls.

     

    Table 5: Exception Action Fields/Controls 

    Field/Control

    Action/Comment

    No Exception Monitoring

    Select this control to prevent logging.

    This control temporarily disables the exception directive without deleting it.

    Continue and Log

    Select this control to log the exception and choose another action.

    If you select this option, you can select Change Workload, Raise Alert, and Run Program.

    Change Workload

    1 Select Continue and Log.

    Change Workload is only available with Continue and Log.

    2 Select Change Workload.

    3 Select the WD in the list box to log the exception and move the request to the specified WD.

    If you specify a positive value for Tactical CPU Usage Threshold (per node), you must specify Change Workload and another WD mapped to an AG in the same RP as one of the exception actions.

    All exceptions in the same operating environment with Tactical CPU Usage Threshold (per node) must use the same value and have the same WD in a Change Workload exception action.

    If you specified values for Blocked Time, Elapsed Time, or both, Change Workload is not an option as an exception action.

    When you specify Change Workload as an exception action that is not applied to all operating environments, the following warning displays:

    Notice:

    By not applying an Exception with a Change Workload action to all operating Environments, a request may not consistently route to the same final Workload across different Operating Environments. This may lead to misleading or confusing workload accounting.

    You may not specify Change Workload for the default workload (WD-default).
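The Change Workload restrictions listed above can be summarized as a checklist. The sketch below is a hedged illustration of those rules only; the ExceptionDraft class, its field names, and the message strings are hypothetical and do not correspond to any Teradata WA interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExceptionDraft:
    """Hypothetical snapshot of the choices made for one exception directive."""
    workload: str                       # WD that owns the directive
    criteria: set                       # names of the selected criteria
    base_action: str                    # for example "Continue and Log"
    change_workload_to: Optional[str] = None
    tactical_cpu_threshold: Optional[float] = None


def change_workload_problems(draft):
    """Return the rules from this section that the draft violates."""
    problems = []
    if draft.change_workload_to is not None:
        if draft.base_action != "Continue and Log":
            problems.append("Change Workload requires Continue and Log")
        if draft.criteria & {"Blocked Time", "Elapsed Time"}:
            problems.append("Change Workload is unavailable with Blocked Time or Elapsed Time")
        if draft.workload == "WD-default":
            problems.append("Change Workload cannot be set for the default workload")
    elif draft.tactical_cpu_threshold is not None and draft.tactical_cpu_threshold > 0:
        problems.append("Tactical CPU Usage Threshold requires a Change Workload action")
    return problems


draft = ExceptionDraft("WD-Reports", {"Elapsed Time"}, "Continue and Log",
                       change_workload_to="WD-Batch")
print(change_workload_problems(draft))
```

The check does not cover the requirement that the target WD map to an AG in the same RP, because that depends on configuration data outside this sketch.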

    Raise Alert

    1 Select Continue and Log, Abort and Log, or Abort on Select and Log to access Raise Alert.

    2 Select Raise Alert to log the exception and raise an alert.

    3 Enter the name of the alert configured using Teradata Viewpoint Alert Setup.

    Run Program

    1 Select Continue and Log, Abort and Log, or Abort on Select and Log to access Run Program.

    2 Select Run Program to log the exception and run a program.

    For more information, see Teradata Viewpoint User Guide (B035‑2206).

    Post to Queue Table

    1 Select Continue and Log, Abort and Log, or Abort on Select and Log to access Post to Queue Table.

    2 Select Post to Queue Table to log to the DBC.SystemQtbl table. Note that the Query ID is recorded and can be used to connect information from the TDWMExceptionLog and the DBQLSqlTbl.

    3 [Optional] Enter a comment (maximum 120 characters) in the text box.

    Abort and Log

    Select this control to log the exception and abort the request.

    When you select this option, Raise Alert and Run Program are enabled.

    Abort on Select and Log

    Select this control to log the exception and abort the request if it contains only SELECT statement(s) and the current transaction has not executed any UPDATE, DELETE, or INSERT statements. Otherwise, select Continue and Log.

    When you select this option, Raise Alert and Run Program are enabled.

    Setting Exception Precedence

    After you define two or more local or global exception directives, you can set the precedence. Teradata WA uses the precedence to determine the exception directives that are more important to honor in the event of a conflict.

    Teradata WA typically honors local exception directives before global exception directives.

    Note that Teradata WA does not consider precedence when evaluating exception directives. Teradata WA only considers precedence when performing the actions you specified.

    To set exception precedence

    1 On the Exception tab for a WD, or in the Global Exceptions view, select Precedence. The Exception Precedence dialog box displays.

    The Exception Precedence dialog box lists all exceptions defined for a workload, and indicates their priority in descending order. By default, the exceptions are listed in the order in which they are created.

    2 To change the priority of an exception, select the exception, and then select the up or down arrow to move it to its desired location. Repeat this step for each exception whose priority you want to change.

    3 When finished, select OK to accept the priority listing and close the dialog box.

    Handling Concurrent Multiple Exception Directives for a WD

    Teradata WA follows these guidelines when multiple exception directives (multiple exception criteria/actions) are applicable at the same time for a WD.

    Teradata WA evaluates all exception directives. It is possible that multiple exception directives are exceeded together. If so, Teradata WA performs all the corresponding exception actions that do not conflict.

    A conflict occurs when two exception actions to be performed are either:

  • Abort and change WD or
  • Change to different WDs (for example, change to WD-A and change to WD-B).

    Teradata WA follows these guidelines to resolve conflicting exception actions when necessary:

  • Local exception actions take precedence over global exception actions.
  • Teradata WA orders local and global exception actions according to their defined precedence to resolve situations similar to the following case:

    if Maximum Rows > 100, Change Workload to WD-M

    if Sum Over All Nodes > 200, Change Workload to WD-N

    If Maximum Rows and Sum Over All Nodes both exceed their limits at the same time, the defined precedence determines to which WD Teradata WA changes.

  • If Teradata WA must perform several exception actions because one or more exception criteria occur simultaneously, Teradata WA always executes all Raise Alert, Run Program, and Post to Queue Table exception actions. Other actions occur as follows:
  • If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple global Change Workload exception actions, the global Change Workload exception action with highest precedence occurs. Teradata WA logs all other Change Workload exception actions as overridden.
  • If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple local Change Workload exception actions, the local Change Workload exception action with highest precedence occurs. Teradata WA logs all other Change Workload exception actions as overridden.
  • If you did not specify Abort and Log or Abort on Select and Log, and you specified multiple global and local Change Workload exception actions, the local Change Workload exception action with highest precedence occurs, since local exception actions take precedence over global exception actions. Teradata WA logs all other Change Workload exception actions as overridden.
  • Aborts take precedence over any Change Workload exception actions. If you specified Abort and Log or Abort on Select and Log, and you specified multiple global and local Change Workload exception actions, Teradata WA aborts the query and logs all Change Workload exception actions as overridden.
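The guidelines above can be sketched as a small resolution routine. This is an illustration under stated assumptions, not the actual Teradata WA logic: the Triggered class and its fields are hypothetical, a lower precedence number is assumed to mean higher precedence, and Abort on Select and Log is treated as an unconditional abort for simplicity.

```python
from dataclasses import dataclass
from typing import Optional

ALWAYS_RUN = {"Raise Alert", "Run Program", "Post to Queue Table"}
ABORTS = {"Abort and Log", "Abort on Select and Log"}


@dataclass
class Triggered:
    """Hypothetical record of one exception directive that fired."""
    name: str
    scope: str                  # "local" or "global"
    precedence: int             # assumption: lower number = higher precedence
    actions: set
    change_to: Optional[str] = None


def resolve(triggered):
    """Return (outcome, overridden Change Workload names, always-run actions)."""
    side_effects = [(t.name, a) for t in triggered
                    for a in sorted(t.actions & ALWAYS_RUN)]
    changes = [t for t in triggered if t.change_to is not None]
    if any(t.actions & ABORTS for t in triggered):
        # Aborts take precedence over every Change Workload action.
        return "abort", [t.name for t in changes], side_effects
    if changes:
        # Local directives rank ahead of global ones, then by precedence.
        ranked = sorted(changes, key=lambda t: (t.scope != "local", t.precedence))
        return (f"change to {ranked[0].change_to}",
                [t.name for t in ranked[1:]], side_effects)
    return "continue", [], side_effects


fired = [
    Triggered("rows", "local", 2, {"Continue and Log", "Raise Alert"}, "WD-M"),
    Triggered("cpu", "global", 1, {"Continue and Log"}, "WD-N"),
]
print(resolve(fired))
```

In this example the local directive wins even though the global directive has the better precedence number, and the Raise Alert action still runs.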