Collecting and Recollecting Statistics Is Often Time Consuming - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
imq1591724555718.ditamap
dita:ditavalPath
imq1591724555718.ditaval
dita:id
B035-1184
lifecycle
previous
Product Category
Teradata Vantage™

Because Vantage always does a full-table scan (or an index scan if collecting statistics on an index) when it collects full statistics on a table (except when only PARTITION statistics are collected), the process can take a long time to complete. See Reducing the Cost of Collecting Statistics by Sampling. The exception to this is collecting statistics on the system-derived column PARTITION columns, which is always a fast operation. The duration of the scan is dependent on the size of the table, the number of partitions, the system configuration, and the workload on the system.

Additional resources are consumed by the creation of interval histograms and the computation of the various statistical measures used to summarize the characteristics of a column set or index. For a definition of interval histograms, see Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142.

Columns that are not indexed or that are unique indexes take more time to process than NUSI columns.

Because of the time it can take to collect and recollect statistics, you should consider specifying one or more of the USING options, particularly those that establish recollection thresholds, to establish rules for when Vantage should recollect statistics. For more information about these options, see Comparison of Full and Sampled Statistics and Using the THRESHOLD Options to Collect and Recollect Statistics

Using the THRESHOLD Options to Collect and Recollect Statistics.

You should always collect statistics on newly created, empty tables. This defines the columns, indexes, and partitions for a partitioned table, as well as the synoptic data structures for subsequent collection of statistics and demographics. For a definition of synoptic data structure, see Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142.

You should also recollect statistics on newly created tables as soon as they are populated, and thereafter as needed.

When you perform a HELP STATISTICS request on an empty table, the Unique Value column returns a value of 0.