Collect Statistics on Current Data | Temporal Table Support | Teradata Vantage - Collecting Statistics on Current Data - Analytics Database - Teradata Vantage

Temporal Table Support

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2023-10-30
dita:mapPath
eud1628112402879.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
fif1472244754791
lifecycle
latest
Product Category
Teradata Vantage™

Statistics can be collected on current data at regular intervals. This is more efficient than collecting data on all data in the table, especially when new data is loaded, and global statistics are refreshed, which can require reading and aggregating the entire table, both current and historical data.

While global statistics are useful to help optimize queries that involve both current and historical data, collecting these statistics can occur at a higher threshold of data change than statistics collection limited to current data. For example, you could refresh statistics on current data whenever the change involves more than 10% of the data, while the threshold for global statistics collection could be 20%.

Example: Collecting statistics on current data

This example shows how statistics could be collected on a bitemporal table that has been partitioned to separate the current from invalid and closed data.

CREATE MULTISET TABLE Policy_BiTemp (
       Policy_ID INTEGER,
       Customer_ID INTEGER,
       Policy_Type CHAR(2) NOT NULL,
       Policy_Details CHAR(40),
       Validity PERIOD(DATE) NOT NULL AS VALIDTIME
       Policy_Duration PERIOD(TIMESTAMP(6) WITH TIME ZONE) NOT NULL
       AS TRANSACTIONTIME
       )
PRIMARY INDEX (Policy_ID)
PARTITION BY CASE_N(
(END(Validity) IS NULL OR
END(Validity) >= CURRENT_DATE -INTERVAL '2' DAY) AND
END(Policy_Duration) >= CURRENT_TIMESTAMP,
END(Validity) < CURRENT_DATE-INTERVAL '2' DAY AND
END(Policy_Duration) >= CURRENT_TIMESTAMP,
END(Policy_Duration) < CURRENT_TIMESTAMP);

The first partition created by the CASE_N expression is the current partition. Because most queries of the table will involve the current partition, statistics should be collected on the current data.

COLLECT STATISTICS
            COLUMN Policy_ID
           ,COLUMN Customer_ID
           ,COLUMN Policy_Type
                ON QUERY (SELECT Policy_ID,Customer_ID,Policy_Type
                            FROM Policy_BiTemp
                     WHERE (END(Validity) IS NULL OR
                            END(Validity) >= CURRENT_DATE -
                               INTERVAL '2' DAY)
                        AND END(Policy_Duration) >=
                                         CURRENT_TIMESTAMP
                   ) AS QS_Policy_Bitemp_Currdata;

For more information about COLLECT STATISTICS, see Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144.