17.05 - Collecting QCD Statistics on the PARTITION Column of a Table - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Data Manipulation Language

Advanced SQL Engine
Teradata Database
Release Number
January 2021
English (United States)
Last Update

For information about why you should collect PARTITION statistics for the Optimizer, see Teradata Vantage™ - SQL Data Definition Language Detailed Topics , B035-1184 .

Because PARTITION is not a reserved keyword, you can use it as the name for any column in a table definition, but you should not. This practice is strongly discouraged in all cases, and particularly for partitioned tables, because if you name a non-PARTITION column partition, the system resolves it as a regular column. As a result, you cannot collect statistics on the system-derived PARTITION column of any table that also has a user-defined column named partition.

If a table is partitioned by a single-column expression, its column statistics are inherited as PARTITION statistics. In this special case, you need not collect single-column PARTITION statistics.

For example, assume the following table definition:

     CREATE TABLE table_1 (
       col_1 INTEGER,
       col_2 INTEGER)

If you collect individual column statistics on col_2, which is the partitioning column, then those statistics are also inherited as PARTITION statistics.

You can also collect PARTITION statistics on NPPI tables, which quickly provides up to date statistics on the number of rows in the table. This is because the partition number for every row in an NPPI table is 0.

You can specify a USING SAMPLE clause to collect single-column PARTITION statistics, but the specification is ignored. The system automatically resets the sampling percentage to 100. The sampling field in a detailed statistics report is always reported as 0 to document this behavior. Note that all valid sampling percentages specified for a USING SAMPLE clause are honored for multicolumn PARTITION statistics.

There is a limit of 32 sets of multicolumn statistics that can be collected on a given table. Because single-column PARTITION statistics use index ids in the same range as multicolumn statistics (between 129 and 160, inclusive), collecting PARTITION statistics effectively reduces the limit on the number of multicolumn statistics that can be collected to 31.

The system ignores user-specified column ordering for multicolumn PARTITION statistics. This is consistent with non-PARTITION multicolumn statistics and multicolumn indexes. The columns are ordered based on their internal field id. Because the system-derived PARTITION column has field id value of 0, it always takes the first position in multicolumn statistics.