15.10 - Example: Using the THRESHOLD Option to Recollect Statistics - Teradata Database

Teradata Database SQL Data Definition Language Syntax and Examples

prodname
Teradata Database
vrm_release
15.10
created_date
December 2015
category
Programming Reference
featnum
B035-1144-151K

The following two requests collect statistics on the o_orderkey and o_orderdatetime columns of the orders table for the first time. Teradata Database collects full statistics for both requests, and the system retains the THRESHOLD options at each level.

The first request specifies thresholds of a 10% change in the data or 7 days having passed since statistics were last collected on orders.o_orderkey. While Teradata Database does not use these criteria for a first time collection of statistics, it does save them in DBC.StatsTbl to consult for future requests to recollect statistics on orders.o_orderky.

For later requests to recollect statistics on orders.o_orderkey, Teradata Database determines whether the percent change in the data exceeds 10% and if more than 7 days have passed since statistics were last collected. If either or both criteria are exceeded, the system recollects the statistics, but if neither criterion is exceeded, the system does not recollect the statistics on orders.o_orderkey.

     COLLECT STATISTICS USING THRESHOLD 10 PERCENT AND THRESHOLD 7 DAYS
     COLUMN o_orderkey
     ON orders;

The second request specifies a system-determined data change threshold. While Teradata Database does not use this criterion for a first time collection of statistics, it does save it in DBC.StatsTbl to consult for future requests to recollect statistics on orders.o_orderdatetime.

For later requests to recollect statistics on orders.o_orderdatetime, Teradata Database determines whether the percent change in the data since statistics were last collected exceeds the system-determined threshold criterion. If it does, the system recollects the statistics, but if it does not, the system does not recollect the statistics on orders.o_orderdatetime.

    COLLECT STATISTICS USING SYSTEM THRESHOLD
     COLUMN o_orderdatetime
     ON orders;

Teradata Database stores the thresholds you specify and uses them for future requests to recollect statistics on the o_orderkey or o_orderdatetime columns of orders.

The following request recollects statistics on the orders table. Based on the THRESHOLD settings that were in place when the statistics were collected, the Optimizer does not recollect statistics on orders.o_orderkey if the table growth is less than 10% or the age of the statistics is less than 7 days. Teradata Database only recollects these statistics if the growth exceeds either 10% or the age is greater than 7 days. Similarly, if the growth of the column o_orderdatetime is less than the system-selected threshold for o_orderdatetime, the Optimizer does not recollect those statistics.

     COLLECT STATISTICS ON orders;

Depending on the settings for the THRESHOLD option in those requests, the following criteria determine whether the Optimizer actually recollects the requested statistics.

  • If the growth in cardinality of orders is less than 10% and if the existing statistics are less than 7 days old, the Optimizer does not recollect those statistics.

    Even though you specified both a growth percentage threshold and a number of days threshold, both criteria need not be met for Teradata Database to recollect statistics on o_orderkey. For example, if the system determines that the growth percentage of the table exceeds its system-determined threshold, it recollects the specified statistics even if the number of days threshold is not exceeded.

  • If the growth in cardinality of orders is less than the system-selected threshold for o_orderdatetime, the Optimizer does not recollect those statistics.

    The thresholds that Teradata Database selects depend on the history of the column update-delete-insert counts from the last statistics collection, and so forth. For details, see SQL Request and Transaction Processing, B035-1142.

The intent of the following request is to recollect statistics on orders.o_orderdatetime if the established thresholds are met. If the growth of the table is less than the system-selected threshold, the Optimizer does not recollect the requested statistics.

     COLLECT STATISTICS 
     COLUMN o_orderdatetime
     ON orders;

The intent of the following request is to recollect statistics on o_orderdatetime with a user-specified growth threshold of 10%. If the table growth is not greater than 10%, the Optimizer does not recollect the requested statistics. Teradata Database records the newly specified THRESHOLD percentage so it can be used for future recollections of statistics.

     COLLECT STATISTICS USING THRESHOLD 10 PERCENT
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime with no checking for threshold options. This forces the statistics to be recollected. Because you specify the FOR CURRENT clause, the existing threshold options are not changed, and the specified NO THRESHOLD setting is not remembered for subsequent recollections of statistics on orders.orderdatetime.

     COLLECT STATISTICS USING NO THRESHOLD FOR CURRENT
     COLUMN o_orderdatetime
     ON orders;

The intent of the following request is to recollect statistics on o_orderdatetime. The NO THRESHOLD specification forces Teradata Database to recollect the statistics. The Optimizer remembers the newly specified THRESHOLD option and always recollects statistics in the future.

     COLLECT STATISTICS USING NO THRESHOLD
     COLUMN o_orderdatetime
     ON orders;

The following request recollects the statistics collected in the previous example. The Optimizer skips the recollection on o_orderkey if the number of changes to the table demographics from the last collection time is less than 10% and the age of the statistics is less than 7 days.

In other words, if the number of changes exceeds 10% or if the age of the previously collected statistics is greater than 7 days, Teradata Database recollects the statistics for o_orderkey. Similarly, if changes to the table demographics are less than the system-determined change threshold for o_orderdatetime and o_orderstatus, the Optimizer skips the recollection for those two columns.

     
COLLECT STATISTICS ON orders;

The following request recollects statistics on o_orderdatetime. If the table demographics changes since the last time statistics were collected are less than the system-determined change threshold, the Optimizer does not recollect the requested statistics.

     COLLECT STATISTICS 
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime with a user-specified change threshold of 10 percent. If the table demographics changes since the last time statistics were collected are less than 10%, the Optimizer does not recollect the requested statistics.

Teradata Database stores the change THRESHOLD percentage to be used for future recollections.

     COLLECT STATISTICS 
     USING THRESHOLD 10 PERCENT
     COLUMN o_orderdatetime
     ON orders;

The following request recollects statistics on o_orderdatetime without checking for thresholds. In other words, the request forces the statistics to be recollected. Because you specify the FOR CURRENT option, Teradata Database does not change the existing THRESHOLD specifications, and does not store the specified NO THRESHOLD option for later recollections.

     COLLECT STATISTICS 
     USING NO THRESHOLD FOR CURRENT
     COLUMN o_orderdatetime
     ON orders;

The following request is identical to the previous request except that unlike that request, this COLLECT STATISTICS request does not specify the FOR CURRENT option. Because the request does not specify FOR CURRENT, the Optimizer stores this NO THRESHOLD specification and does not skip recollecting statistics for subsequent requests.

     COLLECT STATISTICS 
     USING NO THRESHOLD
     COLUMN o_orderdatetime
     ON orders;