17.05 - Example: Using the THRESHOLD Option to Recollect Statistics - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Data Definition Language Syntax and Examples

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
Release Date
January 2021
Content Type
Programming Reference
Publication ID
B035-1144-175K
Language
English (United States)

The following two requests collect statistics on the o_orderkey and o_orderdatetime columns of the orders table for the first time. The database collects full statistics for both requests, and the system retains the THRESHOLD options at each level.

The first request specifies thresholds of a 10% change in the data or 7 days having passed since statistics were last collected on orders.o_orderkey. While the database does not use these criteria for a first time collection of statistics, it does save them in DBC.StatsTbl to consult for future requests to recollect statistics on orders.o_orderky.

For later requests to recollect statistics on orders.o_orderkey, the database determines whether the percent change in the data exceeds 10% and if more than 7 days have passed since statistics were last collected. If either or both criteria are exceeded, the system recollects the statistics, but if neither criterion is exceeded, the system does not recollect the statistics on orders.o_orderkey.

     COLLECT STATISTICS USING THRESHOLD 10 PERCENT AND THRESHOLD 7 DAYS
     COLUMN o_orderkey
     ON orders;

The second request specifies a system-determined data change threshold. While the database does not use this criterion for a first time collection of statistics, it does save it in DBC.StatsTbl to consult for future requests to recollect statistics on orders.o_orderdatetime.

For later requests to recollect statistics on orders.o_orderdatetime,the database determines whether the percent change in the data since statistics were last collected exceeds the system-determined threshold criterion. If it does, the system recollects the statistics, but if it does not, the system does not recollect the statistics on orders.o_orderdatetime.

    COLLECT STATISTICS USING SYSTEM THRESHOLD
     COLUMN o_orderdatetime
     ON orders;

The database stores the thresholds you specify and uses them for future requests to recollect statistics on the o_orderkey or o_orderdatetime columns of orders.

The following request recollects statistics on the orders table. Based on the THRESHOLD settings that were in place when the statistics were collected, the Optimizer does not recollect statistics on orders.o_orderkey if the table growth is less than 10% or the age of the statistics is less than 7 days. The database only recollects these statistics if the growth exceeds either 10% or the age is greater than 7 days. Similarly, if the growth of the column o_orderdatetime is less than the system-selected threshold for o_orderdatetime, the Optimizer does not recollect those statistics.

     COLLECT STATISTICS ON orders;

Depending on the settings for the THRESHOLD option in those requests, the following criteria determine whether the Optimizer actually recollects the requested statistics.

  • If the growth in cardinality of orders is less than 10% and if the existing statistics are less than 7 days old, the Optimizer does not recollect those statistics.

    Even though you specified both a growth percentage threshold and a number of days threshold, both criteria need not be met for the database to recollect statistics on o_orderkey. For example, if the system determines that the growth percentage of the table exceeds its system-determined threshold, it recollects the specified statistics even if the number of days threshold is not exceeded.

  • If the growth in cardinality of orders is less than the system-selected threshold for o_orderdatetime, the Optimizer does not recollect those statistics.

    The thresholds that the database selects depend on the history of the column update-delete-insert counts from the last statistics collection, and so forth. For details, see Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142.

The intent of the following request is to recollect statistics on orders.o_orderdatetime if the established thresholds are met. If the growth of the table is less than the system-selected threshold, the Optimizer does not recollect the requested statistics.

     COLLECT STATISTICS 
     COLUMN o_orderdatetime
     ON orders;

The intent of the following request is to recollect statistics on o_orderdatetime with a user-specified growth threshold of 10%. If the table growth is not greater than 10%, the Optimizer does not recollect the requested statistics. The database records the newly specified THRESHOLD percentage so it can be used for future recollections of statistics.

     COLLECT STATISTICS USING THRESHOLD 10 PERCENT
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime with no checking for threshold options. This forces the statistics to be recollected. Because you specify the FOR CURRENT clause, the existing threshold options are not changed, and the specified NO THRESHOLD setting is not remembered for subsequent recollections of statistics on orders.orderdatetime.

     COLLECT STATISTICS USING NO THRESHOLD FOR CURRENT
     COLUMN o_orderdatetime
     ON orders;

The intent of the following request is to recollect statistics on o_orderdatetime. The NO THRESHOLD specification forces the database to recollect the statistics. The Optimizer remembers the newly specified THRESHOLD option and always recollects statistics in the future.

     COLLECT STATISTICS USING NO THRESHOLD
     COLUMN o_orderdatetime
     ON orders;

The following request recollects the statistics collected in the previous example. The Optimizer skips the recollection on o_orderkey if the number of changes to the table demographics from the last collection time is less than 10% and the age of the statistics is less than 7 days.

In other words, if the number of changes exceeds 10% or if the age of the previously collected statistics is greater than 7 days, the database recollects the statistics for o_orderkey. Similarly, if changes to the table demographics are less than the system-determined change threshold for o_orderdatetime and o_orderstatus, the Optimizer skips the recollection for those two columns.

     
COLLECT STATISTICS ON orders;

The following request recollects statistics on o_orderdatetime. If the table demographics changes since the last time statistics were collected are less than the system-determined change threshold, the Optimizer does not recollect the requested statistics.

     COLLECT STATISTICS 
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime with a user-specified change threshold of 10 percent. If the table demographics changes since the last time statistics were collected are less than 10%, the Optimizer does not recollect the requested statistics.

The database stores the change THRESHOLD percentage to be used for future recollections.

     COLLECT STATISTICS 
     USING THRESHOLD 10 PERCENT
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime without checking for thresholds. In other words, the request forces the statistics to be recollected. Because you specify the FOR CURRENT option, the database does not change the existing THRESHOLD specifications, and does not store the specified NO THRESHOLD option for later recollections.

     COLLECT STATISTICS 
     USING NO THRESHOLD FOR CURRENT
     COLUMN o_orderdatetime
     ON orders;

The following request is identical to the previous request except that unlike that request, this COLLECT STATISTICS request does not specify the FOR CURRENT option. Because the request does not specify FOR CURRENT, the Optimizer stores this NO THRESHOLD specification and does not skip recollecting statistics for subsequent requests.

     COLLECT STATISTICS 
     USING NO THRESHOLD
     COLUMN o_orderdatetime
     ON orders;