Rules and Guidelines for COLLECT STATISTICS (Optimizer Form)
The following rules apply to using COLLECT STATISTICS.
A COLLECT STATISTICS request can be submitted in one of three ways.
As a single‑statement request.
As the only statement in a macro.
As the only or last statement in an explicit Teradata session mode transaction bracketed
by BEGIN and END TRANSACTION statements.
Because base global temporary tables do not contain data, the statistics you collect
on them have dummy histograms without intervals and a cardinality of 0 rows.
When you specify TEMPORARY to collect statistics on a materialized global temporary
table for the first time, without specifying COLUMN or INDEX options, the materialized
table inherits its statistics definitions from the base temporary table.
Specifying the TEMPORARY keyword to collect statistics on a global temporary table
materializes the table in the current session, if it is not already materialized.
When you log off from a session or if the system forces a logoff, Teradata Database
automatically drops the statistics from all materialized temporary tables.
The system treats multicolumn statistics the same as index statistics if an index
has been defined on the same column set. For an example, see “Collecting Statistics on Multiple Columns” on page 196.
Teradata Database preserves the ordering of columns when you collect multicolumn statistics.
However, if the base table statistics are collected on an index using the INDEX keyword,
the system uses index column ordering in the ascending order of the field ids.
The maximum number of column and index sets you can recollect on a base table, global
temporary table, hash index, or join index is 512.
This limit can be lower than 512 column or index sets subject to limits on the other
resources such as the number of available spools and plastic or concrete step segment
You cannot collect statistics on columns with a data type of Period, XML, BLOB, CLOB,
or any UDT other than Geospatial.
You can collect statistics on the BEGIN and END expressions of a Period column.
You cannot collect statistics on an entire column with the JSON data type. However,
you can collect statistics on extracted portions of the JSON data type. See Teradata JSON.
You can collect statistics on a base table column defined using a complex expression.
You can also collect statistics on a single‑table join index or hash index column
defined using a complex expression in the respective select or column list. Either
method enables the Optimizer to make accurate single‑table cardinality estimates for
queries that specify complex expressions in their predicates that it can match to
a simple index column that is defined on that expression or a superset of it. For
details, see SQL Request and Transaction Processing.However, COLLECT STATISTICS on base table is preferred as base table's statistics
can be directly inherited by all join indexes.
The following rules apply to collecting statistics on partitioned tables.
You can collect statistics on both the primary index, if the table is primary‑indexed,
and row partitioning column sets.
If you specify to collect statistics using the name of the primary index, then statistics
are collected only on the primary index column set.
You cannot submit a COLLECT STATISTICS request on a table where:
Statistics have been dropped
COLLECT STATISTICS … COLUMN or COLLECT STATISTICS … INDEX clause has not been specified
You cannot recollect statistics on a table on which statistics have been dropped.
You cannot specify a USING SAMPLE clause for standard recollection of statistics on
implicitly specified column and index sets.
When you specify a USING option for first time statistics collection or to reset an
existing option, Teradata Database automatically applies the same options for subsequent
recollections of statistics.
When you specify the FOR CURRENT option, Teradata Database uses the USING options
that you specify only for the current COLLECT STATISTICS request.
Teradata Database does not remember the USING options you specify FOR CURRENT for
To collect multiple statistics, you should group the statistics into a single COLLECT
For first time collections, group the statistics that specify the same USING options
together. This enables the Optimizer to apply global optimizations such as early aggregation
and aggregation rollups to hasten the collection process.
You can group recollections of statistics without regard to the various USING options
specified for individual statistics.
You should use a table‑ level COLLECT STATISTICS request without specifying column
or index references to refresh all the statistics for a given table or constant expression.
You cannot specify the SUMMARY option with column references or USING options.
When you recollect statistics on a column or index, Teradata Database automatically
updates the table‑level demographics.