Rules and Guidelines for the COLLECT STATISTICS (Optimizer Form) FROM source_table Clause - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
imq1591724555718.ditamap
dita:ditavalPath
imq1591724555718.ditaval
dita:id
B035-1184
lifecycle
previous
Product Category
Teradata Vantage™
Specifying a FROM clause with a COLLECT STATISTICS request provides the same results as specifying a CREATE TABLE … AS … WITH DATA AND STATISTICS with the following exceptions.
  • You must have the SELECT privilege on the source table to copy its statistics to a target table.
  • COLLECT STATISTICS FROM does not copy data. It copies only the statistics for the specified column sets.
  • You cannot specify a USING clause for a COLLECT STATISTICS request used to copy statistics.

    Vantage instead copies the USING options for the most recent COLLECT STATISTICS request from the source to the target.

  • Use the SUMMARY option to copy table-level demographics from a source table to a target table. When you copy statistics without specifying the column list for the source and target tables, Vantage implicitly copies the SUMMARY statistics from the source to the target. If the system copies the SUMMARY statistics implicitly, you should recollect those statistics natively when you later refresh them.

    Unless you have a specific reason for copying SUMMARY statistics from a source table to a target table, you should not copy them.

  • You can use a COLLECT STATISTICS request after you have created an identical target table using either a basic CREATE TABLE request or using a CREATE TABLE … AS … WITH DATA request.

    When you copy PARTITION statistics, the statistics copied to the target table might not correctly represent the data in the target table because of differences in internal partition number mapping between the source and target tables. This is true even if the table definitions returned by a SHOW TABLE request are identical and the data is the same in both tables.

    If you use a CREATE TABLE … AS … WITH DATA AND STATISTICS request to create a target table, the PARTITION statistics you copy from the source table are not valid if the internal partition numbers in the target table are different than the source table.

    A target table created using a CREATE TABLE … AS … WITH DATA AND STATISTICS request may not be identical to the source table from which its statistics are copied down to the level of internal partition numbers., even though the two tables might appear to be identical from comparing their definitions using the output of SHOW TABLE requests on the two.

  • All data in the columns whose statistics are copied using COLLECT STATISTICS must be identical in the source and target tables.

    If they are not identical, you are strongly advised to recollect the statistics for the target table.

  • You do not obtain the same results from a COLLECT STATISTICS … FROM source_table request and a CREATE TABLE … AS … WITH DATA AND STATISTICS request if the internal partition numbers are different. While Vantage copies the same statistics using both methods, in the COLLECT STATISTICS … FROM source_table case, there are cases where the statistics are not valid for the data even though data is the same in both tables.

As a general rule, you should always recollect the PARTITION statistics for the target table when you copy them from a source table.

The following rules apply to copying statistics from a source table to an identical target table using a COLLECT STATISTICS request.
  • When you specify column lists for both the source and target tables, Vantage copies the source table statistics to the target table only if the attributes for the columns in the specified target table are identical to the attributes for the columns in the specified source table. If they are not, the system does not copy the statistics and returns an error to the requestor.

    The source table and target table column lists do not need to be identical.

  • When you do not specify a column list for either the source table or for the target table, Vantage copies statistics from the source table to the target table only if the attributes for each column in the target table are the same as the attributes for the corresponding columns in the source table. Otherwise, Vantage does not copy the statistics and returns an error to the requestor.

    For example, the attributes of all four columns of the target table must be identical to those of the four columns of the source table.

    This rule applies to all column attributes: the data type, NULL/NOT NULL definition, uniqueness, case specificity, uppercase definition, and so forth, must match.

  • When you specify a column list for the source table, but not for the target table, then both tables must have matching columns.

    In this case, the system copies the statistics from the specified source table into the corresponding column set in the target table.

  • When you specify a column list for the target table, but not for the source table, then the column attributes for both tables must be identical.

    In this case, Vantage copies the statistics from the corresponding source columns into the columns specified in the target table.

  • After checking the general eligibility rules, Vantage retrieves all source table single-column statistics.
    The following additional rules apply to copying single-column statistics.
    • If the appropriate single-column statistics have been collected for a column in the source table, then they are copied to the corresponding column of the target table.
    • If the appropriate single-column statistics have not been collected for a column in the source table, statistics are not copied.
    • If the appropriate single-column statistics have already been collected for both the source table and the target table, Vantage writes over the target table statistics with the statistics previously collected for the source table.
  • After checking the general eligibility rules, Vantage uses an algorithm to determine which multicolumn statistics are eligible to be copied to the target table. The system copies statistics collected on all multicolumn sets or indexes in the source table to the corresponding multicolumn sets or multicolumn index sets of the target table.
    The following additional rules apply to copying multicolumn statistics.
    • If the appropriate multicolumn or multicolumn index statistics have been collected for the source table, but not for the target table, then Vantage copies those statistics from the source table to the target table.
    • If the appropriate multicolumn or multicolumn index statistics have been collected for both the target table and the source table, Vantage writes over the existing target table statistics with the statistics previously collected for the source table.
    • If the appropriate multicolumn or multicolumn index statistics have been collected for the target table, but not for the source table, Vantage does not change the existing target table statistics.
  • The following additional rule applies to copying only a subset of single-column statistics.

    If no statistics have been collected on the specified source table column, the request aborts and returns an error to the requestor.

  • The following additional rule applies to copying only a subset of multicolumn statistics.

    If statistics have not been collected on the specified source table multicolumn set, the request aborts and returns an error to the requestor.

  • The following additional rules apply to copying statistics for global temporary tables.
    • If both the source and target tables are global temporary tables, then the system copies the temporary statistics from the source table as temporary statistics on the target table.
    • If the source table is a global temporary table and the target table is a permanent table, then the system copies the temporary statistics from the source table as permanent statistics on the target table.
    • If the target table is a global temporary table and the source table is a permanent table, then the system copies the statistics from the source table as temporary statistics on the target table
    • If the target table is a base global temporary table, then the system copies the source table statistics as zeroed statistics on the target table.

      The term zeroed statistics refers to the condition in which the synopsis data structures, or histograms (see Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142), for the statistics on a column set or index have been constructed, but no statistics have been collected on the column set or index. A common example of this state is when you create a new table and then collect statistics on it when it does not yet contain any data rows.

    • If the target table is an unmaterialized global temporary table, then the system materializes it before it copies temporary statistics from the source table as temporary statistics on the target table.