Collecting QCD Statistics on the PARTITION Column of a Table - Analytics Database - Teradata Vantage

SQL Data Manipulation Language

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-04-05
dita:mapPath
pon1628111750298.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
esx1472246586715
lifecycle
latest
Product Category
Teradata Vantage™

For information about why you should collect PARTITION statistics for the Optimizer, see Teradata Vantage™ - SQL Data Definition Language Detailed Topics, B035-1184.

Because PARTITION is not a reserved keyword, you can use it as the name for any column in a table definition, but you should not. This practice is strongly discouraged in all cases, and particularly for partitioned tables, because if you name a non-PARTITION column partition, the system resolves it as a regular column. As a result, you cannot collect statistics on the system-derived PARTITION column of any table that also has a user-defined column named partition.

If a table is partitioned by a single-column expression, its column statistics are inherited as PARTITION statistics. In this special case, you need not collect single-column PARTITION statistics.

For example, assume the following table definition:

     CREATE TABLE table_1 (
       col_1 INTEGER,
       col_2 INTEGER)
     PRIMARY INDEX(col_1) PARTITION BY col_2;

If you collect individual column statistics on col_2, which is the partitioning column, then those statistics are also inherited as PARTITION statistics.

You can also collect PARTITION statistics on NPPI tables, which quickly provides up to date statistics on the number of rows in the table. This is because the partition number for every row in an NPPI table is 0.

You can specify a USING SAMPLE clause to collect single-column PARTITION statistics, but the specification is ignored. The system automatically resets the sampling percentage to 100. The sampling field in a detailed statistics report is always reported as 0 to document this behavior. Note that all valid sampling percentages specified for a USING SAMPLE clause are honored for multicolumn PARTITION statistics.

There is a limit of 32 sets of multicolumn statistics that can be collected on a given table. Because single-column PARTITION statistics use index ids in the same range as multicolumn statistics (between 129 and 160, inclusive), collecting PARTITION statistics effectively reduces the limit on the number of multicolumn statistics that can be collected to 31.

The system ignores user-specified column ordering for multicolumn PARTITION statistics. This is consistent with non-PARTITION multicolumn statistics and multicolumn indexes. The columns are ordered based on their internal field id. Because the system-derived PARTITION column has field id value of 0, it always takes the first position in multicolumn statistics.