Sizing a Query Capture Database - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

Query Capture Databases are used to analyze SQL queries for better optimization. A QCD can occupy significant storage space, so it is important to understand how much capacity is required to perform query optimization analyses.

Capacity Planning for a Query Capture Database

The following physical limits constrain the QCD functionality:
  • QCD tables are ordinary Teradata relational tables, and are limited to a row length of 1 MB.

    Each row length must be an even number of bytes (see Byte Alignment), so be sure to take this into account.

  • QCDs draw their disk space from PERM space. Consider defining the spool space for a QCD using a constant expression to scale the amount of spool space assigned based on the number of AMPs on the system.

    For example, you may specify the amount of spool space for a QCD as in the following CREATE DATABASE request.

    CREATE DATABASE qcd AS SPOOL = 2e5 * (HASHAMP() + 1);
    After creating the QCD, the spool space does not change if you add more AMPs to your system because the size is based on the number of AMPs in the system at the time the QCD is created, not on the current configuration.
  • The row length for tables in an aligned row format QCD is larger to align the rows on appropriate modulo(8) boundaries (see Row Structure for Aligned Row Format Systems).

You can estimate the magnitude of the raw data (without indexes) generated by a single INSERT EXPLAIN or DUMP EXPLAIN statement from the following equation:

Raw data size is approximately equal to (2n + M (t × d))

where:

Equation Element Description
 n Specifies the size in bytes of the EXPLAIN modifier output for the same query.

The approximation assumes that all captured data demographics can be accommodated within 2n bytes.

m Specifies the size in bytes of the SQL query text.
t Specifies the average size in bytes of the DDL text for the object.
d Specifies the number of tables and views in the query.
INSERT EXPLAIN and DUMP EXPLAIN are supported only on the Block File System on the primary cluster, not the Object File System.

The approximation is more complex when statistics are captured using INSERT EXPLAIN WITH STATISTICS. This equation applies only to INSERT EXPLAIN WITH STATISTICS. DUMP EXPLAIN does not capture statistics.

Raw data size is approximately equal to (2n + m + (t × d)) + 5000c

where:

Equation Element Description
n Specifies the size in bytes of the EXPLAIN request modifier output for the same query.

The approximation assumes that all captured data demographics can be accommodated within 2n bytes.

m Specifies the size in bytes of the SQL query text.
t Specifies the average size in bytes of the DDL text for the object.
d Specifies the number of tables and views in the query.
c Specifies the number of columns involved in range and explicit equality conditions in the query WHERE clause.

If you do not specify WITH STATISTICS when you perform INSERT EXPLAIN, the value for c is 0.

The coefficient 5,000 is used as the average size of statistics based on the following information:
  • The 101 interval histograms used to store the statistics occupy 4,900 bytes per column. This count includes interval 0 and intervals 1 through 100. Interval 0 contains column- or index-global statistics, while intervals 1 through 200 contain interval-specific column statistics.
  • 100 bytes are added to account for an upward bound on miscellaneous bytes per column. Storage overhead in bytes per column due to statistics is approximately equal to 4900 + 100 = 5000

Related Information

For more information about query capture databases and query optimization analysis, see Query Optimizers .