USING Options - Analytics Database - Teradata Vantage

SQL Data Definition Language Detailed Topics

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
dita:mapPath
vuk1628111288877.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
jbg1472252759029
lifecycle
latest
Product Category
Teradata Vantage™

The COLLECT STATISTICS (Optimizer Form) USING options apply to various sampling and threshold options that you can use to more finely tune the row sampling you specify or to avoid recollecting statistics when the data has not changed enough according to recollection thresholds you specify. You cannot specify USING options if you also specify the SUMMARY option for a COLLECT STATISTICS request. For more information about sampled and threshold statistics, see Reducing the Cost of Collecting Statistics by Sampling.

The following table describes the USING options.

Option Description
MAXINTERVALS n Maximum number of histogram intervals to be used for the collected statistics. Vantage might adjust the specified maximum number of intervals depending on the maximum histogram size.

This option is valid for tables and constant expressions.

The value for n must be an integer number.

The valid range for n is 0 through 500.

If you do not specify this option, Vantage determines the maximum number of intervals to use for the histogram.

You can only specify this option if you also specify an explicit column or index.

You cannot specify MAXINTERVALS n for a standard recollection of statistics on an implicitly specified index or column set.

If you specify 0 intervals, the request only captures summary statistics such a the number of unique values, the number of nulls, and so on.

The larger the number of intervals, the more optimal the granularity of the statistical data in the histogram. A finer granularity enables better single-table and join selectivity estimates for non-uniform data; however, you should apply this optimal granularity selectively because the larger the number of intervals, the larger the size of the histogram, which can increase query optimization time.

MAXVALUELENGTH n Maximum size for histogram values such as MinValue, ModeValue, or MaxValue. The value for n must be an integer number.
  • For single-column statistics, the valid range of n is 1 through the maximum size of the column.
  • For multicolumn statistics, the valid range of n is 1 through the combined maximum size of all the columns.

This option is valid for both tables and constant expressions.

If you specify a larger size than the maximum size, Vantage automatically adjusts the value to the maximum size.

Vantage might also adjust the maximum value length you specify based on the size of the histogram that contains the statistics.

If you do not specify a MAXVALUELENGTH, Vantage determines the maximum size of the intervals to be used for the histogram.

For single-character statistics on CHARACTER and VARCHAR columns, n specifies the number of characters. For all other options, n specifies number of bytes.

For multicolumn statistics, Vantage concatenates the values and truncates them if necessary to fit into the specified maximum size.

The system does not truncate numeric values for single-column statistics. The system increases the interval size automatically if the specification is not sufficient to accommodate the full value for single-column statistics on numeric columns.

For multicolumn statistics, if the maximum interval size truncates numeric statistical data, Vantage automatically increases the maximum interval size to accommodate the numeric column on the maximum size boundary.

A larger maximum value size causes Vantage to retain the value until the specified maximum is reached, which can enable better single-table and join selectivity estimates for skewed columns. However, you should be selective when increasing the size for the required columns because increasing the maximum value size also increases the size of the histogram, which can increase query optimization time.

You can only specify this option if you also specify an explicit column or index.

You cannot specify MAXVALUELENGTH n for a standard recollection of statistics on an implicitly specified index or column set.

NO SAMPLE Use a full-table scan to collect the specified statistics.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify NO SAMPLE for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, NO SAMPLE overrides the default.
  • For recollections of statistics, NO SAMPLE overrides the previous SAMPLE options and collects full statistics.
NO THRESHOLD
  • Do not apply any thresholds to the collection of statistics.
  • Remove the existing threshold before collecting the statistics.

You can only specify this option if you also specify an explicit column or index.

You cannot specify NO THRESHOLD for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, NO THRESHOLD overrides the default settings.
  • For recollections of statistics, NO THRESHOLD overrides any previously specified THRESHOLD options and recollects the statistics without any thresholds.
NO THRESHOLD DAYS
  • Do not apply a DAYS threshold to the collection of statistics.
  • Remove the existing DAYS threshold before collecting the statistics.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify NO THRESHOLD DAYS for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, NO THRESHOLD DAYS overrides the default setting.
  • For recollection of statistics, NO THRESHOLD DAYS overrides any previous DAYS threshold specification.
NO THRESHOLD PERCENT
  • Do not apply a PERCENT change threshold to the collection of statistics.
  • Remove the existing PERCENT change threshold before collecting the statistics.

You can only specify this option if you also specify an explicit column or index.

You cannot specify NO THRESHOLD PERCENT for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, NO THRESHOLD PERCENT overrides the default.
  • For recollections of statistics, NO THRESHOLD PERCENT overrides any previous change threshold percent specification.
SAMPLE Scan a system-determined percentage of table rows to collect the specified statistics.

SAMPLE has the same meaning as SYSTEM SAMPLE and is only provided for backward compatibility to enable existing COLLECT STATISTICS scripts that specify the USING SAMPLE option to continue to run.

You should use the SYSTEM SAMPLE option instead of SAMPLE.

SAMPLE n PERCENT Scans the percentage of table rows that you specify rather than scanning all of the rows in the table to collect statistics.

The value for n can be a decimal number or integer from 2 through 100.

Specifying SAMPLE 100 PERCENT is equivalent to collecting full statistics.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SAMPLE n PERCENT for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, the specified sample percentage overrides the default.
  • For recollection of statistics, SAMPLE n PERCENT overrides any previous SAMPLE option specifications and instead scans n percent of the rows in the table.
SYSTEM MAXINTERVALS Use the system-determined maximum number of intervals for this histogram.

This option is valid for tables.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM MAXINTERVALS for a standard recollection of statistics on an implicitly specified index or column set.

Use the system-determined maximum number of intervals for the histogram.

SYSTEM MAXVALUELENGTH Use the system-determined maximum column width for histogram values such as MinValue, ModeValue, or MaxValue.

This option is valid for tables.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM MAXVALUELENGTH for a standard recollection of statistics on an implicitly specified index or column set.

SYSTEM SAMPLE Scan a system-determined percentage of table rows to collect statistics. Vantage may collect a sample of 100 percent several times before downgrading the sampling percentage to a lower value.

SYSTEM SAMPLE is the default option if you do not specify the SAMPLE option.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM SAMPLE for a standard recollection of statistics on an implicitly specified index or column set.

The DBS Control field SysSampleOption in the STATISTICS field group contains the default for this option. For more information, see Teradata Vantage™ - Database Utilities, B035-1102.

SYSTEM THRESHOLD Collect statistics only if the percentage of changed data or the age of the current statistics exceeds the currently specified threshold for the statistic.

Vantage determines the appropriate change threshold to skip recollections if the changes are under this threshold. When doing this, the system takes into consideration the changes of update, delete and insert counts from the last collection of statistics, the history of column demographics, column usage, and Optimizer extrapolation techniques to determine the appropriate change threshold.

If statistics are being collected first time, they are not skipped because there is no existing histogram data that can be used to determine the delta data change or age of the statistics.

The change threshold can be different for different columns.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM THRESHOLD for a standard recollection of statistics on an implicitly specified index or column set.

SYSTEM THRESHOLD DAYS Uses a change-based percentage as a threshold for recollecting statistics.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM THRESHOLD DAYS for a standard recollection of statistics on an implicitly specified index or column set.

SYSTEM THRESHOLD PERCENT Only applies to change percentage. This is the default if you do not specify a PERCENT threshold option.

Vantage determines the appropriate change threshold to skip recollections if the percentage is under this threshold. When doing this, the system takes into consideration the percentage of change since the last update, delete, and insert counts from the last collection of statistics, the history of column demographics, column usage, and Optimizer extrapolation techniques to determine the appropriate change percentage threshold.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify SYSTEM THRESHOLD PERCENT for a standard recollection of statistics on an implicitly specified index or column set.

If you are collecting these statistics for the first time, they are not skipped because there is no existing histogram data that can be used to determine the percentage of change for the data.

The DBS Control field SysChangeThresholdOption in the STATISTICS field group contains the default for this option. For more information, see Teradata Vantage™ - Database Utilities, B035-1102.

THRESHOLD n DAYS Do not to recollect statistics if the age of the statistic is less than the number of days specified.

The value for n must be an integer number that represents the number of days.

The valid range for n is 1 - 9999.

You can only specify this option if you also specify an explicit index or column set.

You cannot specify THRESHOLD n DAYS for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, collecting statistics is not skipped because no current histogram exists to use to determine the age of the current statistics.

    Instead, the specified number of days overrides the default setting.

  • For recollection of statistics, THRESHOLD n DAYS overrides the previous THRESHOLD n DAYS specification and instead applies the specified number of days threshold to the collection.

The DBS Control field DefaultTimeThreshold in the STATISTICS field group contains the default for this option. For more information, see Teradata Vantage™ - Database Utilities, B035-1102.

THRESHOLD n PERCENT Do not to recollect statistics if the percentage of data change since the last collection is less than the specified percentage.

The value for n can be either a decimal number or an integer number.

The valid range of n is 1 - 9999.99.

You can only specify this option if you also specify an explicit column or index.

You cannot specify THRESHOLD n PERCENT for a standard recollection of statistics on an implicitly specified index or column set.
  • For the first collection of statistics, THRESHOLD n PERCENT overrides the default setting.
  • For recollection of statistics, THRESHOLD n PERCENT overrides the previous threshold change percentage specification and instead applies the specified threshold percentage.

The DBS Control field DefaultUserChangeThreshold in the STATISTICS field group contains the default for this option. For more information, see Teradata Vantage™ - Database Utilities, B035-1102.