16.20 - Controlling Recommended Actions and Overriding Default Settings - Teradata Vantage NewSQL Engine

Teradata Vantage™ Application Programming Reference

prodname
Teradata Database
Teradata Vantage NewSQL Engine
vrm_release
16.20
created_date
March 2019
category
Programming Reference
featnum
B035-1090-162K

You can use the DBAControl open APIs to control the application of recommended actions by the Analyzer. The Analyzer mines metadata from DBQL, object use count, and the Data Definition Language processor to identify a critical subset of missing statistics identified by the Optimizer. If logged query plan data from DBQL is unavailable, physical design metadata (such as indexes, primary key or foreign key relationships, and so on) is used to identify missing statistics.

You can also use DBAControl open APIs to override any of the default settings stored in the TDSTATS database.

There are four sets of DBAControl open APIs:

Functionality

Using DBAControl for Excluding Objects

If you want to ... Use the following SQL interface ...
exclude a database or table from all automated statistics management operations. AddExcludedObject.
reverse a prior exclusion on specified objects RemoveExcludedObject.
show the current list of excluded objects SelectAllExcludedObjects.

Using DBAControl for Statistics Settings

If you want to ... Use the following SQL interface ...
specify histogram size for collections on specified objects UpdateStatHistogramSettings.
specify sampling options for collections on specified objects UpdateStatSampleSetting.
specify threshold limits for refreshing statistics on specified objects UpdateStatThresholdSetting.

Using DBAControl for Prepared Collections

If you want to ... Use the following SQL interface ...
add a specified collection to a prepared list AddPreparedCollect.
remove a specified collection from the list RemovePreparedCollect.
display the list of collections commands in a list SelectPreparedCollects.
change the submission order of an individual collection within the list PrioritizePreparedCollect.

Using DBAControl for Managing Space

If you want to ... Use the following SQL interface ...
prevent the TDSTATS.AnalyzerHistoryTbl table from consuming an excessive amount of space TruncateAnalyzerHistory.
prevent the table from consuming an excessive amount of space TruncateAutomateHistory.
prevent the TDSTATS.CommandsHistoryTbl table from consuming an excessive amount of space TruncateCollectHistory.

Other Advanced Settings

Other advanced settings intended for use by support personnel or advanced users are not controlled by DBA Control APIs. Instead, they are stored as rows within a TDStats table and controlled by directly updating the rows with the SQL UPDATE statement.

Advanced settings are stored in TDStats.DefaultsTbl with the following default values:

Setting Name Default Value Description
Max_Percent_Missing 20 Max% of Analyzer Missing Recommendations to apply.
Min_Number_Missing 10 Min# of Analyzer Missing Recommendations to apply.
Max_Threshold_Skip_Count 2 Maximum number of collection skips for Critical statistics.
Analyze_Logged_Usage_Events Y Flag controlling analysis of logged usage of existing stats - System Default.

'Y' - perform function, 'N' - skip function

Analyze_Logged_Usage_Events_

For_<JobName>

Y Flag controlling analysis of logged usage of existing stats for individual Viewpoint Stats Manager job.
Analyze_Logged_Missing_Events Y Flag controlling analysis of logged missing stats recommendations - System Default.

'Y' - perform function, 'N' - skip function

Analyze_Logged_Missing_Events_

For_<JobName>

Y Flag controlling analysis of logged missing stats recommendations for individual Viewpoint Stats Manager job.
Analyze_Staleness Y Flag controlling analysis of potentially stale statistics - System Default.

'Y' - perform function, 'N' - skip function

Analyze_Staleness_For_

<JobName>

Y Flag controlling analysis of potentially stale statistics for individual Viewpoint Stats Manager job.
Validate_Missing_Stats_

Recommendations

1 Validate Optimizer Missing Stats Recommendations within Analyze Jobs:
  • 0 - Off
  • 1 - On (default)
Validate_Missing_Stats_

NonFatal_Error_Codes

'03523,03524' List of DBS error codes that do not fail Validation after EXPLAIN COLLECT STATISTICS failure. See related setting Validate_Missing_Stats_Recommendations.
Collect_Job_NonRetry_Error_

Check

1 Check if error is one that Collect Jobs should not retry:
  • 0 - Off
  • 1 - On (default)
Collect_Job_NonRetry_Error_

Codes

'02665,02666, 03598, 03807' List of DBS error codes that Collect Jobs should not retry after COLLECT STATISTICS error. See the related setting Collect_Job_NonRetry_Error_Check.
Analyze_QueryLog_TimeLimit 0 Maximum elapsed time in minutes for Analyze Jobs to spend analyzing logged queries. A zero or negative value indicates no limit. Applies to all Analyze Jobs that do not have an individual setting as specified by the related setting Analyze_QueryLog_TimeLimit_For_<JobName>.
Analyze_QueryLog_TimeLimit_

For_<JobName>

0 Maximum elapsed time in minutes for the specified Viewpoint Stats Manager job to spend analyzing logged queries. Replace <JobName> with the actual job name.
Monitor_Analyze_Job_Progress 0 Enable progress monitoring of certain major functions performed by Analyze Jobs, including the analysis of logged queries for missing and used stats and the analysis of existing stats for potential staleness:
  • 0 - Off (default)
  • 1 - On
Analyze_Large_XML_Method 1 Controls how Analyze Jobs reads large XML text from the query log. The default is to use the UDF based method.
  • 1 - UDF concat (default)
  • 2 - Client side concat
Analyze_Parse_Parallel Y Flag controlling use of parallelism for the parsing and processing of logged XML documents during analysis. Applies to all Analyze Jobs that do not have an individual setting as specified by the related setting Analyze_Parse Parallel_For_<JobName>:
  • 'Y' - Use parallelism (default)
  • 'N' - Skip parallelism
Analyze_Parse_Parallel_For_<JobName> Y Flag controlling use of parallelism for the parsing and processing of logged XML documents during analysis for an individual Viewpoint Stats Manager job.
  • 'Y' - Use parallelism (default)
  • 'N' - Skip parallelism

To modify the values in this table use the SQL UPDATE statement.

Example: Updating Advanced Settings

Change the system wide default setting for analyzing logged usage data. This example disables the Stats Manager Job function that analyzes logged data for existing stats usage.

UPDATE TDStats.DefaultsTbl
SET DefaultValue = 'N'
WHERE SettingName = 'Analyze_Logged_Usage_Events';

Example: Changing Setting for Individual Stats Manager Job

Make the same setting change as shown in Example: Updating Advanced Settings, but limit the scope of the change to an individual Stats Manager Job named MyAnalyzeJob.

INSERT INTO TDStats.DefaultsTbl (SettingName,DefaultValue)
  VALUES ('Analyze_Logged_Usage_Events_For_MyAnalyzeJob,'N');

Example: Updating Settings for Analyze Jobs

Set the default time limit for all Analyze Jobs to 24 hours (1440 minutes).

UPDATE TDStats.DefaultsTbl
SET DefaultValue = '1440'
WHERE SettingName = 'Analyze_QueryLog_TimeLimit';

Example: Updating Viewpoint Job Settings for Analyze Jobs

Set the time limit for Viewpoint Job “ANALYZE_EIS_DATABASES” to 3 hours (180 minutes).

INSERT INTO TDStats.DefaultsTbl (SettingName,DefaultValue)
  VALUES ('Analyze_QueryLog_TimeLimit_For_ANALYZE_EIS_DATABASES','180');

Example: Disabling Parallel Processing for Analyze Jobs

To disable parallel processing for all Analyze Jobs, use the UPDATE command:

UPDATE TDStats.DefaultsTbl 
	SET DefaultValue = 'N' 
	WHERE SettingName = 'Analyze_Parse_Parallel';

Recommended Guidelines for Changing Advanced Settings

  • If Viewpoint Stats manager Analyze Jobs take an excessively long time to complete, change the Analyze_Logged_Usage_Events setting to 'N'. Analyzing DBQL STATUSAGE requires extra processing overhead and the data obtained describes already existing statistics. Skipping this particular Analyze function will often improve job performance while still performing the more important function of analyzing missing statistics and recommending new statistics. To limit the scope of the changed setting to an individual job, insert a new row whose SettingName value is of the form Analyze_Logged_Usage_Events_For_<JobName> where <JobName> is replaced with the actual job's name.
  • To increase the number of statistics recommendations reported by Analyze Jobs, increase the values for the settings Max_Percent_Missing and Min_Number_Missing. Analyzer logic is designed to identify and report a critical number of subset recommendations according to criteria that measure and rank their potential benefit. Increase these settings to see more of the candidate recommendations that would not ordinarily qualify.
  • If the number of statistics recommendations reported by Analyze Jobs is overwhelming, decrease the values for the settings Max_Percent_Missing and Min_Number_Missing. If the quantity of Analyzer recommendations is excessive or the quality of many of them is poor, reducing these settings will further limit the recommendations to those candidates with the highest estimated benefits. The quality of a given recommendation is considered poor when its collection does not improve query performance.
  • To ensure that statistics labeled critical by the Analyzer are forcibly recollected in spite of their defined thresholds, decrease the Max_Threshold_Skip_Count setting. Collect Jobs are capable of temporarily overriding the THRESHOLD for a critical statistic if its recollection has been deemed 'starved.' A statistic is first labeled 'critical' by the Analyzer if it is involved in steps with cardinality estimation errors. A critical statistic becomes 'starved' if it has not been recollected in a configurable number of past Collect Job runs.
  • The setting Validate_Missing_Stats_Recommendations should be left on (value of 1) to ensure that statistics recommendations from Analyze Jobs are valid and will not encounter parse-time failures when collected. Similarly, the setting Collect_Job_NonRetry_Error_Check should be left on (value of 1) to ensure that any statistics recommendations that encounter unexpected runtime failures during Collect Job runs are inactivated and not resubmitted in subsequent Collect Job runs. The settings Validate_Missing_Stats_NonFatal_Error_Codes and Collect_Job_NonRetry_Error_Codes can be used to further customize the validation and error checking behavior to define which DBS error codes are considered non-fatal (retryable) or fatal (non-retryable).
  • Analyze_QueryLog_TimeLimit and Analyze_QueryLog_TimeLimit_For_<JobName> impose a time limit on that portion of Analyze Job processing that reads logged query data from the DBQL STATSUSAGE option. This phase of Analyze Jobs is typically the most time consuming and hence the most relevant for imposing a time limit. Additional job time is required to aggregate and rank the logged query data and perform additional analysis functions that do not rely on query log STATSUSAGE data including staleness determination and the deactivation of unused statistics.
  • Analyze Jobs whose elapsed times exceed the configured value in Analyze_QueryLog_TimeLimit or Analyze_QueryLog_TimeLimit_For_<JobName> will terminate successfully with generated missing stats recommendations based on the logged query data read and analyzed up to that point. A special event row representing time limit expiration is inserted into table TDStats.AnalyzerHistoryTbl with column EventType set to 'T' and column EventDescription reporting the number of queries analyzed before time expired along with the Viewpoint job name (up to first 20 characters).
  • To impose a TimeLimit on a specific job only, users should insert a new row into TDStats.DefaultsTbl whose SettingName value is of the form Analyze_QueryLog_TimeLimit_For_<JobName> where substring <JobName> is replaced with the actual job's name as defined in Viewpoint Stats Manager. Additional rows can be inserted for each job that requires a custom time limit.
  • When Monitor_Analyze_Job_Progress is enabled (on), Analyze Jobs will record special events that measure the progress of major analysis phases expressed as a percentage complete. Progress indicator events are inserted as special rows into table TDStats.AnalyzerHistoryTbl with column EventType set to 'P' and column EventDescription reporting the percentage of logged queries analyzed thus far or the percentage of existing stats whose staleness has been determined. Progress percentages and elapsed times are reset at the beginning of each major Analyze function phase. The relevant Viewpoint job name (up to first 20 characters) is included in the EventDescription column value.
  • Analyze_QueryLog_TimeLimit and Analyze_QueryLog_TimeLimit_For_<JobName> can be used independently of Monitor_Analyze_Job_Progress or they can be used with Monitor_Analyze_Job_Progress. Progress can be monitored with or without specifying a time limit.
  • Time limits specified by Analyze_QueryLog_TimeLimit and Analyze_QueryLog_TimeLimit_For_<JobName> should not be set so low as to prevent an Analyze Job from reading a representative subset of queries. Setting the time limit too low can result in Analyze Job recommendations that are not beneficial to large portions of the total query workload. Note that Analyze Jobs read queries from the log in ascending order of their DBQL assigned QueryID which means that older queries are generally read first.
  • To skip a particular Analyze Job function, change the related settings Analyze_Logged_Usage_Events, Analyze_Logged_Missing_Events, or Analyze_Staleness. These settings provide a disabling mechanism for controlling the execution of Analyze Jobs functions and achieving the associated savings in elapsed time.
  • Keeping the Analyze_Parse_Parallel setting at the system default of 'Y' has shown performance improvement in Analyze Jobs, but with an increase in spool usage. The extra spool is needed for the set-based amp-level parallel processing that is key to faster Analyze Jobs. If the increase in spool usage becomes problematic, the recommendations are to make more spool space available to the Analyze Job, or to reconfigure the Viewpoint Analyze Job to operate on a shorter period or portion of query log data. The other option is to change the Analyze_Parse_Parallel setting to 'N' to turn off parallelism.
  • An increase in spool usage may also be seen with an Analyze_Large_XML_Method setting of '1'. Recommendations are to make more spool space available to the Analyze Job, reconfigure the Viewpoint Analyze Job to operate on a shorter period or portion of query log data, or to change the Analyze_Large_XML_Method setting to '2'.
  • An Analyze_Parse_Parallel setting of 'Y' is incompatible with an Analyze_Large_XML_Method setting of '2' and will result in Analyze_Parse_Parallel being treated as if the setting had a value of 'N'.

Reporting Progress Indicator and Timer Events

Analyzer events that report progress and time limit expiration can be retrieved from TDStats table AnalyzerHistoryTbl. The query below demonstrates how this is done for a Viewpoint Analyze Job named Analyze_Personnel_DB.

The output from the first execution of this query reports progress events for an Analyze job that is still in the process of reading query log rows. Output from the second execution of this same query at a later time reports progress events indicating that a time limit expired while reading query log rows and the Analyze Job has now transitioned to a subsequent phase that analyzes existing statistics for staleness. Output from the third execution of this query reports progress events indicating the job has finished analyzing all statistics for staleness.

Report the progress of the most recently submitted Analyze Job named Analyze_Personnel_DB:

SELECT Eventtimestamp (FORMAT 'YYYY-MM-DD hh:mi'), EventDescription
  FROM TDStats.AnalyzerHistoryTbl
  WHERE EventType IN ('T','P') AND EventDescription LIKE '%Job   Analyze_Personnel_DB%' AND
  AnalysisId = (SELECT MAX(AnalysisId) FROM TDStats.AnalyzerHistoryTbl     WHERE EventDescription LIKE '%Job Analyze_Personnel_DB%')
ORDER BY 1 ASC;
*** Query completed. 3 rows found. 2 columns returned.
*** Total elapsed time was 1 second.
EventTimeStamp  EventDescription
2015-04-2100:00  Progress of Job Analyze_Personnel_DBs: 0 of 500000 query log rows analyzed (0 percent) after 1 mins
2015-04-2100:05  Progress of Job Analyze_Personnel_DBs: 5000 of 500000 query log rows analyzed (1 percent) after 5 mins
2015-04-2100:11  Progress of Job Analyze_Personnel_DBs: 10000 of 500000 query log rows analyzed (2 percent) after 11 mins 

Execute the same SQL at a later time to report the updated progress of this same running job.

SELECT Eventtimestamp (FORMAT 'YYYY-MM-DD hh:mi'), EventDescription
  FROM TDStats.AnalyzerHistoryTbl
  WHERE EventType IN ('T','P') AND EventDescription LIKE '%Job   Analyze_Personnel_DB%' AND
  AnalysisId = (SELECT MAX(AnalysisId) FROM TDStats.AnalyzerHistoryTbl     WHERE EventDescription LIKE '%Job Analyze_Personnel_DB%')
ORDER BY 1 ASC;
*** Query completed. 3 rows found. 2 columns returned.
*** Total elapsed time was 1 second.EventTimeStamp  EventDescription
2015-04-2100:00  Progress of Job Analyze_Personnel_DBs: 0 of 500000 query log rows analyzed (0 percent) after 1 mins
2015-04-2100:05  Progress of Job Analyze_Personnel_DBs: 5000 of 500000 query log rows analyzed (1 percent) after 5 mins
2015-04-2100:11  Progress of Job Analyze_Personnel_DBs: 10000 of 500000 query log rows analyzed (2 percent) after 11 mins
…
2015-04-2200:50  Progress of Job Analyze_Personnel_DBs: 110000 of 500000 query log rows analyzed (22 percent) after 110 mins
2015-04-2200:55  Progress of Job Analyze_Personnel_DBs: 115000 of 500000 query log rows analyzed (23 percent) after 115 mins
2015-04-2300:01  Time Expired for Job Analyze_Personnel_DBs: 120000 of 500000 query log rows analyzed with TimeLimit of 120 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 0 of 1000 stats analyzed for staleness (0 percent) after 1 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 10 of 1000 stats analyzed for staleness (1 percent) after 1 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 20 of 1000 stats analyzed for staleness (2 percent) after 1 mins

Execute the same SQL at yet a later time to report the updated progress of this same running job.

SELECT Eventtimestamp (FORMAT 'YYYY-MM-DD hh:mi'), EventDescription
  FROM TDStats.AnalyzerHistoryTbl
  WHERE EventType IN ('T','P') AND EventDescription LIKE '%Job   Analyze_Personnel_DB%' AND
  AnalysisId = (SELECT MAX(AnalysisId) FROM TDStats.AnalyzerHistoryTbl     WHERE EventDescription LIKE '%Job Analyze_Personnel_DB%')
ORDER BY 1 ASC;
*** Query completed. 3 rows found. 2 columns returned.
*** Total elapsed time was 1 second.
EventTimeStamp  EventDescription
2015-04-2100:00  Progress of Job Analyze_Personnel_DBs: 0 of 500000 query log rows analyzed (0 percent) after 1 mins
2015-04-2100:05  Progress of Job Analyze_Personnel_DBs: 5000 of 500000 query log rows analyzed (1 percent) after 5 mins
2015-04-2100:11  Progress of Job Analyze_Personnel_DBs: 10000 of 500000 query log rows analyzed (2 percent) after 11 mins
…
2015-04-2200:50  Progress of Job Analyze_Personnel_DBs: 110000 of 500000 query log rows analyzed (22 percent) after 110 mins
2015-04-2200:55  Progress of Job Analyze_Personnel_DBs: 115000 of 500000 query log rows analyzed (23 percent) after 115 mins
2015-04-2300:01  Time Expired for Job Analyze_Personnel_DBs: 120000 of 500000 query log rows analyzed with TimeLimit of 120 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 0 of 1000 stats analyzed for staleness (0 percent) after 1 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 10 of 1000 stats analyzed for staleness (1 percent) after 1 mins
2015-04-2300:02  Progress of Job Analyze_Personnel_DBs: 20 of 1000 stats analyzed for staleness (2 percent) after 1 mins
…
2015-04-2300:12  Progress of Job Analyze_Personnel_DBs: 980 of 1000 stats analyzed for staleness (100 percent) after 11 mins
2015-04-2300:13  Progress of Job Analyze_Personnel_DBs: 1000 of 1000 stats analyzed for staleness (100 percent) after 12 mins