Product:
Release Number: 15.00
Language: English (United States)
Last Update: 2018-09-27
Product Category:
### Analyzing Workloads Based On “What” Parameters


#### Equal-Width Versus Equal-Height Histograms

A histogram is a vertical bar chart in which the frequency corresponding to a class is represented by the area of a bar (or rectangle) whose base is the class width. The histogram differs from a bar chart in that it is the area of the bar that denotes the value, not the height. However, if the widths of the bars are uniform (that is, equal-width) then only the height need be considered.

Teradata WA uses both equal-width and equal-height histograms in analyzing the what parameter.

### Equal-Width Histograms

An equal-width histogram, such as that shown in Figure 5, divides data into a fixed number of equal-width ranges. The height of each range represents the number of values falling into that range.

Figure 5: Equal-width histogram

For example, suppose that the values in a single column of a 1000-row table range between 1 and 100, and you want to generate a 10-bucket equal-width histogram. (Ranges in histograms are often referred to as “buckets.”) The buckets would contain the values 1-10, 11-20, 21-30, and so on, where each bucket counts the number of rows falling into its range. For a list of supported analysis parameters, see Table 3 on page 49.
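The bucketing in this example can be sketched as follows. This is an illustrative Python sketch of the equal-width technique; the function name and data are hypothetical, not part of Teradata WA:

```python
import random

def equal_width_buckets(values, num_buckets):
    """Count how many values fall into each of num_buckets
    equal-width ranges spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # Clamp so the maximum value lands in the last bucket.
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
    return counts

# 1000 values between 1 and 100, as in the example above
random.seed(0)
values = [random.uniform(1, 100) for _ in range(1000)]
counts = equal_width_buckets(values, 10)
print(counts)       # ten counts, one per bucket
print(sum(counts))  # 1000 -- every row lands in exactly one bucket
```

Each bucket's count is simply the number of rows whose value falls into its fixed-width range.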

### Equal-Height Histograms

Equal-width histograms work well when the variation in the data distribution is small. They do not work as well, however, when that variation is large. For example, in Figure 5, 95% of the data falls into the first bucket and the remaining 5% is scattered across the other 19 buckets, making it difficult to analyze the data effectively.

In this situation, an equal-height histogram is the solution, such as that shown in Figure 6. Such histograms work well when the variation in data distribution is large. Unlike equal-width histograms, they place the same number of values into each range, so the endpoints of each range are determined by the number of values it contains.

Figure 6: Equal-height histogram

The graph itself is less informative than the bin widths identified on its x-axis. They reveal not only that the vast majority of data points fall in the 0-8.41 range, but also that the vast majority of queries (80%) consume less than 0.01 CPU seconds.

This information is provided in a pop-up dialog box for each equal-height histogram. For instructions on viewing histograms and their data, see step 6 on page 62.
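The quantile idea behind equal-height buckets can be sketched as follows. This is illustrative Python, not Teradata WA code; each endpoint is chosen so that roughly the same number of values falls between consecutive endpoints:

```python
def equal_height_boundaries(values, num_buckets):
    """Compute bucket endpoints so each bucket holds (about) the same
    number of values: the endpoints are the k/num_buckets quantiles
    of the sorted data."""
    data = sorted(values)
    n = len(data)
    # One endpoint after every n/num_buckets values.
    return [data[min(k * n // num_buckets, n - 1)]
            for k in range(1, num_buckets + 1)]

# Heavily skewed data: most values tiny, a few very large.
values = [0.005] * 800 + [0.5] * 150 + [50.0] * 40 + [500.0] * 10
print(equal_height_boundaries(values, 10))
```

With skewed data like this, several consecutive endpoints coincide in the dense low range, while the last endpoints stretch out to cover the few large values, which is exactly why the bin widths on the x-axis carry the information.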

## To display a “what” parameter histogram

1 From the Candidate Workloads tree, expand the workload to be analyzed.

3 Select the appropriate “what” options in the Workload Distribution Parameters list.

4 Click Perform Analysis to analyze the selected workload with chosen options.

By default, the Analyze Workload window creates an equal-width histogram, dividing the range of the selected distribution parameter into the number of buckets specified in the Top N Value field. Each bucket spans an equal portion of that range, and its height shows the total query count falling into the bucket.

5 To view the corresponding equal-height histogram, click the Switch to Equal-Height Histogram button on the Analyze Workload tab.

You can also right-click the histogram itself and select Equal-Height Histogram from the menu that appears.

6 To view the Equal-Height Distribution report associated with the histogram, select View Equal-Height Table.

The table appears similar to the following.

#### Creating a New Workload From a Particular Bucket In the Histogram

For this procedure, Estimated Processing Time is used as an example.

## To create a new workload from a particular bucket in the histogram

1 From the Estimated Processing Time histogram, highlight the bucket to be analyzed, then right-click.

2 Select the Split to New Workload option.

The Estimated Processing Time dialog box appears.

3 Enter the minimum and maximum processing time bucket parameters.

4 Click OK.

5 Enter the workload attribute details. See the table after step 4 of “To group unassigned requests into a workload” on page 44 for more information.

6 Click OK.

A new workload is created with Minimum and Maximum Estimated Processing Time along with the first level of classifications. The selected row/bucket is removed from the Analyze Workload report.

## To zoom in on a bucket for more detail

For this procedure, Estimated Processing Time is used as an example. Zooming in on a bucket creates another Top N Values distribution within the bucket, presenting a finer view of the data.

1 From the Estimated Processing Time histogram, right-click the bucket to zoom in on.

2 Select Zoom In from the short-cut menu.

The Zoom In dialog box appears.

3 Enter the minimum and maximum processing time and the number of buckets, then click OK.

After processing, the bucket is displayed as a new histogram.

4 To reset the zoom to the previous level, right-click and select the Zoom Out option.
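Conceptually, zooming in simply re-buckets the values that fall inside the selected bucket's range. A minimal sketch with illustrative names, not the product's code:

```python
def zoom_in(values, lo, hi, num_buckets):
    """Re-bucket only the values inside [lo, hi] into num_buckets
    equal-width sub-buckets, giving a finer view of that range."""
    subset = [v for v in values if lo <= v <= hi]
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in subset:
        # Clamp so v == hi lands in the last sub-bucket.
        counts[min(int((v - lo) / width), num_buckets - 1)] += 1
    return counts
```

Zooming out is then just restoring the previous (lo, hi) range and re-bucketing the full data.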

#### Rounding Up Equal-Width Buckets for More Intuitive Number Recognition

The widths of equal-width buckets are calculated by dividing the highest value by the number of buckets. While technically accurate, this method may not be intuitive to the user. Rounding up the calculation of the widths used in equal-width buckets may improve the usability of the buckets while making a small sacrifice in their accuracy.

The rounding up of bucket widths is based on calculating the maximum value and then rounding it up so that the last bucket covers the maximum value. For example, if you have a large range of numbers with the highest value = 3876.77, and want to split the values into 10 buckets, then the bucket size is calculated as 3876.77 / 10 = 387.677. If this is rounded up to 400, then the revised buckets would have the following values: 0-400, 401-800, 801-1200, ..., 3201-3600, and 3601-4000. These new buckets would then be reflected in the distribution report shown in Figure 7.

Figure 7: Distribution report after rounding up the widths of buckets for data with a large range of values

For data with a small range of values where the highest value = 3.15, split into 10 buckets, the bucket width is calculated as 3.15 / 10 = 0.315 and rounded up to 0.40. The revised buckets then cover these values: 0-0.40, 0.41-0.80, 0.81-1.20, ..., 3.21-3.60, and 3.61-4.00, as reflected in the distribution report in Figure 8.

Figure 8: Distribution report after rounding up the widths of buckets for data with a small range of values
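One rounding rule consistent with both examples above is to round the raw width up to a single significant digit. This is an assumption for illustration; the product may use a different rule:

```python
import math

def rounded_bucket_width(max_value, num_buckets):
    """Round the raw width max_value/num_buckets up to one
    significant digit, so bucket edges land on round numbers.
    (One plausible rule matching the documented examples.)"""
    raw = max_value / num_buckets
    magnitude = 10 ** math.floor(math.log10(raw))
    return math.ceil(raw / magnitude) * magnitude

print(rounded_bucket_width(3876.77, 10))  # 400
print(rounded_bucket_width(3.15, 10))     # 0.4
```

For 3876.77 the raw width 387.677 rounds up to 400; for 3.15 the raw width 0.315 rounds up to 0.40, matching both worked examples.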

## To display the columns in the “what” classification Analyze Workload report

✔ Select Distribution Parameter.

Table 5 describes the displayed columns in the “what” classification Analyze Workload report.

| Column Name | Description |
| --- | --- |
| Estimated Processing Time (Min, Max, Query Count) | The minimum and maximum estimated processing time, and the number of queries that completed during this collection interval, for this bucket |
| Percent of Total CPU | Percentage of the total CPU time (in seconds) used on all AMPs for this bucket |
| Percent of Total I/O | Percentage of the total number of logical inputs/outputs (reads and writes) issued across all AMPs for this bucket |
| Average Est Processing Time | The average estimated processing time for each query |
| Normalized CPU per Query (Seconds) (Min, Avg, StDev, 95th Percentile, Max) | The minimum, average, standard deviation, 95th percentile, and maximum expected CPU time for queries in this bucket |
| UnNormalized CPU per Query (Seconds) (Min, Avg, StDev, 95th Percentile, Max) | The minimum, average, standard deviation, 95th percentile, and maximum expected unnormalized CPU time for queries in this bucket |
| Response Time (Seconds) (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum response time for queries in this bucket |
| Result Row Count (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum number of result rows returned for this bucket |
| Disk I/O Per Query (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum disk I/Os per query for this bucket |
| CPU To Disk Ratio (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum CPU-to-disk ratio for this bucket |
| Active AMPS (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum number of active AMPs for this bucket |
| Spool Usage (Bytes) (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum spool usage across all VProcs for this bucket |
| CPU Skew (Percent) (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum AMP CPU skew for this bucket |
| I/O Skew (Percent) (Min, Avg, StDev, Max) | The minimum, average, standard deviation, and maximum AMP I/O skew for this bucket |
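The per-bucket summary measures in Table 5 (minimum, average, standard deviation, 95th percentile, maximum) can be sketched as follows. This illustrates the reported statistics only; the function name and percentile convention are assumptions, not the product's implementation:

```python
import statistics

def bucket_stats(samples):
    """Summary statistics reported per bucket: min, average,
    standard deviation, 95th percentile, and max.
    (Nearest-rank percentile is assumed here for simplicity.)"""
    data = sorted(samples)
    p95_index = min(int(0.95 * len(data)), len(data) - 1)
    return {
        "min": data[0],
        "avg": statistics.mean(data),
        "stdev": statistics.stdev(data) if len(data) > 1 else 0.0,
        "p95": data[p95_index],
        "max": data[-1],
    }

# e.g. normalized CPU seconds for the queries in one bucket
print(bucket_stats([0.2, 0.4, 0.4, 0.7, 3.0]))
```

The same five measures apply to each row of Table 5, whether the underlying metric is CPU time, response time, result rows, or skew.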

## To create a workload from the analyzed values

1 From the Analyze Workload window displaying the previously analyzed workload, click the Distribution Parameter option in the View By box. The Distribution Report appears, showing the selected distribution parameter by correlation parameter.

2 Highlight the bucket to be analyzed, then right-click.

The shortcut menu appears. See the following table for descriptions.

3 Select Split to New Workload with (selected Distribution Parameter) as Classification. The Add Workload dialog box appears. See the table to complete the options.

4 Click OK. A new workload is created from the selected workload. The new workload contains the parent workload’s classifications, along with the new criteria.

#### Analyzing Workloads Based on Exceptions

Teradata WA supports the following exception parameters:

• Unnormalized CPU Time
• Normalized CPU Time
• IO Count
• CPU Time
• Spool Size
• Response Time

## To analyze a workload based on exception parameters

Note: Alternatively, expand the node next to the workload to be analyzed under the Candidate Workloads tree, and then select Analyze.

3 Click on the Distribution Parameter drop-down, and select the exceptions and actions.

4 Enter the number of Distribution Buckets. The default number of buckets is 10.

5 Click Perform Analysis to analyze the selected workload with the chosen exception parameters.

An equal-width histogram appears for the selected workload and correlation parameter.

## To create a workload with exception from the analyzed values

1 From the Analyze Workload window with the previously analyzed workload with exceptions, click the Distribution Parameter option in the View By box.

2 Right-click the bucket row to be analyzed. The Workloads Report shortcut menu for the Distribution Parameter view appears. See the following table for descriptions.

3 Select Split to New Workload with (selected Distribution Parameter) as Exception. The Add New Workload With Exception dialog box appears.

4 Select the options as described in the following table:

5 Click OK. The newly created workload with exception appears in the Candidate Workloads tree.

## To add an exception to an existing workload

1 From the Analyze Workload window with the previously analyzed workload with exceptions, click the Distribution Parameter option in the View By box.

2 Right-click the bucket row to be analyzed. The Workloads Report shortcut menu for the Distribution Parameter view appears.

3 Select Add (selected Distribution Parameter) as Exception to workload. The Add Exception to Existing WD dialog box appears.

4 Select the options as described in the following table: