Sizing a Query Capture Database - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-22
dita:mapPath
qby1588121512748.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

Query Capture Databases are used to analyze SQL queries for ways in which they can be better optimized. A QCD can occupy significant storage space, so it is important to understand how much capacity is required to perform query optimization analyses.

Capacity Planning for a Query Capture Database

The following physical limits constrain the QCD functionality:
  • Because QCD tables are ordinary Teradata relational tables, they are limited to a row length of 1 MB.

    Remember that all row lengths must be an even number of bytes (see Byte Alignment), so be sure to take this into account.

  • QCDs draw their disk space from PERM space just like any other persistent Teradata relational tables. Consider defining the spool space for a QCD using a constant expression to scale the amount of spool space assigned based on the number of AMPs on the system.

    For example, you might specify the amount of spool space for a QCD as in the following CREATE DATABASE request.

         CREATE DATABASE qcd AS SPOOL = 2e5 * (HASHAMP() + 1);
    After creating the QCD, the spool space does not change if you add more AMPs to your system because the size is based on the number of AMPs in the system at the time the QCD is created, not on the current configuration.
  • The row length for tables in an aligned row format QCD is larger in order to align the rows on appropriate modulo(8) boundaries, just as it is for all other base table rows on an aligned row format system (see Row Structure for Aligned Row Format Systems).

You can estimate the magnitude of the raw data (without indexes) generated by a single INSERT EXPLAIN or DUMP EXPLAIN statement from the following equation:

Raw data size is approximately equal to (2n + M (t × d))

where:

Equation element … Specifies the …
 n size in bytes of the EXPLAIN modifier output for the same query.

The approximation assumes that all captured data demographics can be accommodated within 2n bytes.

m size in bytes of the SQL query text.
t average size in bytes of the DDL text for the object.
d number of tables and views in the query.

The approximation is slightly more complex when statistics are captured using INSERT EXPLAIN WITH STATISTICS. This equation applies only to INSERT EXPLAIN WITH STATISTICS. DUMP EXPLAIN does not capture statistics.

Raw data size is approximately equal to (2n + m + (t × d)) + 5000c

where:

Equation element … Specifies the …
n size in bytes of the EXPLAIN request modifier output for the same query.

The approximation assumes that all captured data demographics can be accommodated within 2n bytes.

m size in bytes of the SQL query text.
t average size in bytes of the DDL text for the object.
d number of tables and views in the query.
c number of columns involved in range and explicit equality conditions in the query WHERE clause.

If you do not specify WITH STATISTICS when you perform INSERT EXPLAIN, the value for c is 0.

The coefficient 5,000 is used as the average size of statistics based on the following information:
  • The 101 interval histograms used to store the statistics occupy 4,900 bytes per column. This count includes interval 0 in addition to intervals 1 through 100. Interval 0 contains column- or index-global statistics, while intervals 1 through 200 contain interval-specific column statistics.
  • 100 bytes are added to account for an upward bound on miscellaneous bytes per column.Storage overhead in bytes per column due to statistics is approximately equal to 4900 + 100 = 5000

Related Topics

For more information about query capture databases and query optimization analysis, see Teradata Vantage™ - SQL Request and Transaction Processing, B035-1142.