15.00 - USING Options - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

prodname
Teradata Database
vrm_release
15.00
category
Programming Reference
featnum
B035-1184-015K

USING Options

The COLLECT STATISTICS (Optimizer Form) USING options apply to various sampling and threshold options that you can use to more finely tune the row sampling you specify or to avoid recollecting statistics when the data has not changed enough according to recollection thresholds you specify. You cannot specify USING options if you also specify the SUMMARY option for a COLLECT STATISTICS request. For more information about sampled and threshold statistics, see “Reducing the Cost of Collecting Statistics by Sampling” on page 179.

The following table describes the USING options.

 

              Option

                                                                        Description

MAXINTERVALS n

MAXINTERVALS n specifies the maximum number of histogram intervals to be used for the collected statistics. Teradata Database might adjust the specified maximum number of intervals depending on the maximum histogram size.

This option is valid for tables and constant expressions.

The value for n must be an integer number.

The valid range for n is 0 - 500.

If you do not specify this option, Teradata Database determines the maximum number of intervals to use for the histogram.

You can only specify this option if you also specify an explicit column or index.

You cannot specify MAXINTERVALS n for a standard recollection of statistics on an implicitly specified index or column set.

If you specify 0 intervals, the request only captures summary statistics such a the number of unique values, the number of nulls, and so on.

The larger the number of intervals, the more optimal the granularity of the statistical data in the histogram. A finer granularity enables better single‑table and join selectivity estimates for non‑uniform data; however, you should apply this optimal granularity selectively because the larger the number of intervals, the larger the size of the histogram, which can increase query optimization time.

MAXVALUELENGTH n

MAXVALUELENGTH n specifies the maximum size for histogram values such as MinValue, ModeValue, or MaxValue. The value for n must be an integer number.

  • For single‑column statistics, the valid range of n is 1 - maximum size of the column.
  • For multicolumn statistics, the valid range of n is 1 - combined maximum size of all the columns.
  • This option is valid for both tables and constant expressions.

    If you specify a larger size than the maximum size, Teradata Database automatically adjusts the value to the maximum size.

    Teradata Database might also adjust the maximum value length you specify based on the size of the histogram that contains the statistics.

    If you do not specify a MAXVALUELENGTH, Teradata Database determines the maximum size of the intervals to be used for the histogram.

    For single‑character statistics on CHARACTER and VARCHAR columns, n specifies the number of characters. For all other options, n specifies number of bytes.

    For multicolumn statistics, Teradata Database concatenates the values and truncates them if necessary to fit into the specified maximum size.

    Teradata Database never truncates numeric values for single-column statistics. The system increases the interval size automatically if the specification is not sufficient to accommodate the full value for single-column statistics on numeric columns.

    For multicolumn statistics, if the maximum interval size truncates numeric statistical data, Teradata Database automatically increases the maximum interval size to accommodate the numeric column on the maximum size boundary.

    A larger maximum value size causes Teradata Database to retain the value until the specified maximum is reached, which can enable better single‑table and join selectivity estimates for skewed columns. However, you should be selective when increasing the size for the required columns because increasing the maximum value size also increases the size of the histogram, which can increase query optimization time.

    You can only specify this option if you also specify an explicit column or index.

    You cannot specify MAXVALUELENGTH n for a standard recollection of statistics on an implicitly specified index or column set.

    NO SAMPLE

    Use a full‑table scan to collect the specified statistics.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify NO SAMPLE for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, NO SAMPLE overrides the default specified by the cost profile constant AutoSampleStats.
  • For recollections of statistics, NO SAMPLE overrides the previous SAMPLE options and collects full statistics.
  • NO THRESHOLD

  • Do not apply any thresholds to the collection of statistics.
  • Remove the existing threshold before collecting the statistics.
  • You can only specify this option if you also specify an explicit column or index.

    You cannot specify NO THRESHOLD for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, NO THRESHOLD overrides the default settings of your cost profile.
  • For recollections of statistics, NO THRESHOLD overrides any previously specified THRESHOLD options and recollects the statistics without any thresholds.
  • NO THRESHOLD DAYS

  • Do not apply a DAYS threshold to the collection of statistics.
  • Remove the existing DAYS threshold before collecting the statistics.
  • You can only specify this option if you also specify an explicit index or column set.

    You cannot specify NO THRESHOLD DAYS for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, NO THRESHOLD DAYS overrides the default setting of your cost profile.
  • For recollection of statistics, NO THRESHOLD DAYS overrides any previous DAYS threshold specification.
  • NO THRESHOLD PERCENT

  • Do not apply a PERCENT change threshold to the collection of statistics.
  • Remove the existing PERCENT change threshold before collecting the statistics.
  • You can only specify this option if you also specify an explicit column or index.

    You cannot specify NO THRESHOLD PERCENT for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, NO THRESHOLD PERCENT overrides the default cost profile.
  • For recollections of statistics, NO THRESHOLD PERCENT overrides any previous change threshold percent specification.
  • SAMPLE

    SAMPLE specifies to scan a system‑determined percentage of table rows to collect the specified statistics.

    SAMPLE has the same meaning as SYSTEM SAMPLE and is only provided for backward compatibility to enable existing COLLECT STATISTICS scripts that specify the USING SAMPLE option to continue to run.

    You should use the SYSTEM SAMPLE option instead of SAMPLE.

    SAMPLE n PERCENT

    Scans the percentage of table rows that you specify rather than scanning all of the rows in the table to collect statistics.

    The value for n can be a decimal number or integer from 2 through 100.

    Specifying SAMPLE 100 PERCENT is equivalent to collecting full statistics.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SAMPLE n PERCENT for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, the specified sample percentage overrides the default specified in the setting of the cost profile constant AutoSampleStats.
  • For recollection of statistics, SAMPLE n PERCENT overrides any previous SAMPLE option specifications and instead scans n percent of the rows in the table.
  • SYSTEM MAXINTERVALS

    Use the system‑determined maximum number of intervals for this histogram.

    This option is valid for tables.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM MAXINTERVALS for a standard recollection of statistics on an implicitly specified index or column set.

    Use the system‑determined maximum number of intervals for the histogram.

    SYSTEM MAXVALUELENGTH

    Use the system‑determined maximum column width for histogram values such as MinValue, ModeValue, or MaxValue.

    This option is valid for tables.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM MAXVALUELENGTH for a standard recollection of statistics on an implicitly specified index or column set.

    SYSTEM SAMPLE

    Scan a system‑determined percentage of table rows to collect statistics. Teradata Database may collect a sample of 100 percent several times before downgrading the sampling percentage to a lower value.

    SYSTEM SAMPLE is the default option if you do not specify the SAMPLE option except for cases where its selection as the default is overridden by the setting of the cost profile constant AutoSampleStats.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM SAMPLE for a standard recollection of statistics on an implicitly specified index or column set.

    The DBS Control field SysSampleOption in the STATISTICS group contains the default for this option. The StatsSysSampleOption cost profile constant can override this setting. For more information, see Utilities: Volume 1 (A-K).

    SYSTEM THRESHOLD

    Collect statistics only if the percentage of changed data or the age of the current statistics exceeds the currently specified threshold for the statistic.

    Teradata Database automatically determines the appropriate change threshold to skip recollections if the changes are below this threshold. When doing this, the system takes into consideration the changes of update, delete and insert counts from the last collection of statistics, the history of column demographics, column usage, and Optimizer extrapolation techniques to determine the appropriate change threshold.

    If statistics are being collected first time, they are not skipped because there is no existing histogram data that can be used to determine the delta data change or age of the statistics.

    The change threshold can be different for different columns.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM THRESHOLD for a standard recollection of statistics on an implicitly specified index or column set.

    SYSTEM THRESHOLD DAYS

    Uses a change-based percentage as a threshold for recollecting statistics.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM THRESHOLD DAYS for a standard recollection of statistics on an implicitly specified index or column set.

    SYSTEM THRESHOLD PERCENT

    Only applies to change percentage. This is the default if you do not specify a PERCENT threshold option unless it is overridden by the settings of your cost profile.

    Teradata Database automatically determines the appropriate change threshold to skip recollections if the percentage is below this threshold. When doing this, the system takes into consideration the percentage of change since the last update, delete, and insert counts from the last collection of statistics, the history of column demographics, column usage, and Optimizer extrapolation techniques to determine the appropriate change percentage threshold.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify SYSTEM THRESHOLD PERCENT for a standard recollection of statistics on an implicitly specified index or column set.

    If you are collecting these statistics for the first time, they are not skipped because there is no existing histogram data that can be used to determine the percentage of change for the data.

    The DBS Control field SysChangeThresholdOption in the STATISTICS group contains the default for this option. The StatsSysChangeThresholdOption cost profile constant can override this setting. For more information, see Utilities: Volume 1 (A-K).

    THRESHOLD n DAYS

    THRESHOLD n DAYS specifies not to recollect statistics if the age of the statistic is less than the number of days specified.

    The value for n must be an integer number that represents the number of days.

    The valid range for n is 1 - 9999.

    You can only specify this option if you also specify an explicit index or column set.

    You cannot specify THRESHOLD n DAYS for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, collecting statistics is not skipped because no current histogram exists to use to determine the age of the current statistics.
  • Instead, the specified number of days overrides the default setting of your cost profile.

  • For recollection of statistics, THRESHOLD n DAYS overrides the previous THRESHOLD n DAYS specification and instead applies the specified number of days threshold to the collection.
  • The DBS Control field DefaultTimeThreshold in the STATISTICS group contains the default for this option. The StatsDefaultTimeThreshold cost profile constant can override this setting. For more information, see Utilities: Volume 1 (A-K).

    THRESHOLD n PERCENT

    THRESHOLD n PERCENT specifies not to recollect statistics if the percentage of data change since the last collection is less than the specified percentage.

    The value for n can be either a decimal number or an integer number.

    The valid range of n is 1 - 9999.99.

    You can only specify this option if you also specify an explicit column or index.

    You cannot specify THRESHOLD n PERCENT for a standard recollection of statistics on an implicitly specified index or column set.

  • For the first collection of statistics, THRESHOLD n PERCENT overrides the default setting of your cost profile.
  • For recollection of statistics, THRESHOLD n PERCENT overrides the previous threshold change percentage specification and instead applies the specified threshold percentage.
  • The DBS Control field DefaultUserChangeThreshold in the STATISTICS group contains the default for this option. The StatsDefaultUserChangeThreshold cost profile constant can override this setting. For more information, see Utilities: Volume 1 (A-K).