16.10 - Collecting Statistics - Teradata Database

Teradata Database SQL Fundamentals

Product
Teradata Database
Release Number
16.10
Published
June 2017
Language
English (United States)
Last Update
2018-04-25
dita:mapPath
klx1480972732157.ditamap
dita:ditavalPath
changebar_rev_16_10_exclude_audience_ie.ditaval
dita:id
toy1472253184295

The COLLECT STATISTICS (Optimizer form) statement collects demographic data for one or more columns of a base table, hash index, or join index, computes a statistical profile of the collected data, and stores the synopsis in the Data Dictionary.

The Optimizer uses the synopsis data when it generates its table access and join plans.

Usage

You should collect statistics on newly created, empty data tables. An empty collection defines the columns, indexes, and synoptic data structure for loaded collections. You can easily collect statistics again after the table is populated for prototyping, and again when it is in production.

You can collect statistics on the following.

  • A unique index, which can be:
    • Primary or secondary
    • Single or multiple column
    • Partitioned or nonpartitioned
  • A nonunique index, which can be:
    • Primary or secondary
    • Single or multiple column
    • Partitioned or nonpartitioned
    • With or without COMPRESS fields
  • A non-indexed column or set of columns, which can be:
    • Partitioned or nonpartitioned
    • With or without COMPRESS fields
  • Join index
  • Hash index
  • NoPI table
  • A temporary table
    • If you specify the TEMPORARY keyword but a materialized table does not exist, the system first materializes an instance based on the column names and indexes you specify. This means that after a true instance is created, you can update (re-collect) statistics on the columns by entering COLLECT STATISTICS and the TEMPORARY keyword without having to specify the desired columns and index.
    • If you omit the TEMPORARY keyword but the table is a temporary table, statistics are collected for an empty base table rather than the materialized instance.
  • Sample (system-selected percentage) of the rows of a data table or index, to detect data skew and dynamically increase the sample size when found.
    • The system does not store both sampled and defined statistics for the same index or column set. Once sampled statistics have been collected, implicit re-collection hits the same columns and indexes, and operates in the same mode. To change this, specify any keywords or options and name the columns and/or indexes.

Related Topics

For more information about:

  • Using the COLLECT STATISTICS statement, see SQL Data Definition Language - Syntax and Examples, B035-1144.
  • Collecting statistics on a join index, collecting statistics on a hash index, or when to collect statistics on base table columns instead of hash index columns, see Database Design, B035-1094.
  • Database administration and collecting statistics, see Database Administration , B035-1093 .