16.10 - Guidelines for Collecting Statistics - Teradata Database

Teradata Database Administration

Product
Teradata Database
Release Number
16.10
Release Date
April 2018
Content Type
Administration
Publication ID
B035-1093-161K
Language
English (United States)

Use the COLLECT STATISTICS statement to collect statistics on the columns used in predicates or GROUP BY clauses. When the table size grows more than 10%, you should recollect statistics on the table. Recollection covers all columns on which statistics were initially collected

Task Guideline
Collect UPI statistics on small tables If you collect no other statistics on the table, collect UPI statistics. The table is small (that is, 100 rows/AMP).
Collect NUPI statistics The NUPI is:
  • Fairly or highly unique, and
  • Used commonly in joins, or
  • Skewed
  • Small
Collect NUSI statistics on all NUSIs The Optimizer can use the NUSI in range scans (BETWEEN... AND...).

With statistics available, the system can decide to hash on the values in the range if demographics indicate that to do so would be less costly than a full table scan.

Without statistics, the index will likely not be used and, in some cases, the NUSI may be used when it is not efficient to do so.

Collect NUSI statistics on covering (ALL option) NUSIs If an SI is defined with the intent of covering queries (the ALL option is specified), you should consider collecting statistics even if the indexed columns do not appear in WHERE conditions.

Collecting statistics on a potentially covering NUSI provides the Optimizer with the total number of rows in the NUSI subtable and allows the Optimizer to make better decisions regarding the cost savings from covering.

Collect NUSI statistics on NUSIs with ORDER BY If a sort key is specified in a NUSI definition with the ORDER BY option, collect statistics on that column so the Optimizer can compare the cost of using a NUSI-based access path with a range or equality condition on the sort key column.
Collect non-index column statistics on all non-indexed columns used for table selection Collecting statistics on a group of columns allows the Optimizer to estimate the number of qualifying rows for queries that have search conditions on each of the columns or that have a join condition on each of the columns.
Refresh statistics after updates When:
  • The number of rows changed is greater than 10%.
  • For row partitioned tables >10% for any partition.
  • The demographics of columns with collected statistics changes.
Drop statistics If statistics are no longer needed because no queries are using them, they should be dropped.