INSERT EXPLAIN - Teradata Database

SQL Data Manipulation Language

Product
Teradata Database
Release Number
15.00
Language
English (United States)
Last Update
2018-09-28
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata® Database

INSERT EXPLAIN

Purpose  

Captures the Optimizer white tree information for a query and writes it to a user-defined query capture database, optionally as an XML document.

Syntax  

where:

 

Syntax element …

Specifies …

WITH STATISTICS

to capture the statistics for the WHERE clause condition columns specified in SQL_statement.

You cannot specify WITH STATISTICS if you also specify the IN XML option.

You cannot specify the WITH STATISTICS option if you submit an INSERT EXPLAIN request in Index Validation mode (see “DIAGNOSTIC “Validate Index”” on page 642).

Specifying this option is equivalent to performing COLLECT STATISTICS (QCD Form) on the query table columns.

The columns inserted into the QCD are those determined to be index candidates by the Teradata Index Wizard.

USING SAMPLE percentage

USING SAMPLE percentage PERCENT

the percentage of data to be read for collecting statistics.

If you do not specify this clause, the system automatically selects a sample size based on the cardinality of the table.

You cannot specify a USING SAMPLE clause if you also specify an IN XML clause.

The valid range is between 1 and 100, inclusive.

WITH NO STATISTICS

not to capture statistics for the WHERE clause condition columns specified in SQL_statement.

AND DEMOGRAPHICS

to collect table‑level demographics as well as statistics.

Specifying this option is equivalent to performing a COLLECT DEMOGRAPHICS request on the query tables. See “COLLECT DEMOGRAPHICS” on page 554.

If you do not specify this option, then the system does not collect table‑level demographics.

FOR table_name

FOR database_name.
table_name

FOR user_name.table_name

the set of tables for which data demographics and statistics details are either to be included or excluded for analysis by the Teradata Index Wizard, depending on whether you specify the WITH STATISTICS option or the WITH NO STATISTICS options, respectively.

Statistics are collected on columns that are index candidates and for all columns and indexes referenced explicitly with values in SQL_statement for the following cases. A candidate index column is defined as a column for which the value or join range access frequencies stored in the Field table of the specified QCD are greater than 0. If you specify:

  • WITH STATISTICS, Teradata Database collects statistics for all tables referenced by the query.
  • WITH STATISTICS … FOR table_name, Teradata Database collects statistics only for the tables referenced by the query that are also specified in the FOR table_name clause.
  • WITH NO STATISTICS … FOR table_name, Teradata Database collects statistics only for the tables referenced by the query that are not listed in the FOR table_name clause.
  • You cannot specify a FOR table_name clause if you also specify an IN XML clause.

    INTO QCD_name

    a user-defined query capture database.

    A database named QCD_name must exist on the system. Otherwise, the request aborts and returns an error.

    Using the Control Center feature of the Visual Explain tool is the easiest way to create your QCD databases.

    AS query_plan_name

    an optional user-defined name under which the query plan information is to be stored. For information on object naming, see SQL Fundamentals.

    If no query_plan_name is specified, the query plan information is stored with a null name. Because each query plan is stored with a unique non-null Query ID (see SQL Request and Transaction Processing), various query plans within a given database can be distinguished. Note that Query IDs are not unique across databases.

    query_plan_name is not constrained to be unique.

    You can store query_plan_name as query plan name if you enclose the pad character-separated words in APOSTROPHE characters as “query plan name”.

    LIMIT

    LIMIT SQL

    LIMIT SQL =n

    to place a limit on the size of the query, DDL, and view text captured for the QCD tables Query, Relation, and ViewText, respectively.

    If you do not specify this clause, then the system captures complete SQL text.

    The value for n indicates the upper limit on the amount of text to capture.

    If you specify either LIMIT by itself or LIMIT SQL without a value, then the clause is equivalent to LIMIT SQL = 0, and no text is captured.

    FOR frequency

    the number of times SQL_statement is typically performed within its identified workload.

    This value is used to weight the respective benefits of each column analyzed for inclusion in the index recommendation computed by the Teradata Index Wizard utility.

    Any positive integer up to 4 bytes is valid.

    You cannot specify a FOR frequency clause if you also specify an IN XML clause.

    If this clause is not specified, then the value for frequency defaults to 1.

    CHECK STATISTICS

    to capture COLLECT STATISTICS recommendations for SQL_statement in the StatsRecs QCD table. See SQL Request and Transaction Processing.

    You cannot specify CHECK STATISTICS if you also specify the IN XML option.

    IN XML

    to capture the output as an XML document.

    This document is stored in the QCD table named XMLQCD.

    You cannot specify an IN XML clause if you also specify any of the following options:

  • USING SAMPLE
  • FOR table_name
  • FOR frequency
  • If you specify the IN XML option, you can also specify the NODDLTEXT option with it.

    If you specify the IN XML option, you cannot specify any of the following options with it:

  • CHECK STATISTICS
  • WITH STATISTICS
  • NODDLTEXT

    not to capture the DDL text in the XML document stored in qcd.XMLQCD.

    SQL_request

    the DML request whose Optimizer white tree information is to be captured and written to the QCD.

    SQL_request is limited to the following statements.

  • DELETE
  • EXEC (Macro Form)
  • INSERT
  • MERGE
  • SELECT
  • UPDATE
  • ANSI Compliance

    INSERT EXPLAIN is a Teradata extension to the ANSI SQL:2011 standard.

    Required Privileges

    You must have INSERT privileges on the tables in the user-defined query capture database.

    You must also have the SELECT privilege on all referenced tables to capture data demographics and column statistics.

    Invocation

    Normally invoked using client‑based database query analysis tools.

    Rules for Using INSERT EXPLAIN

    The following rules apply to the use of INSERT EXPLAIN:

  • When you specify a table list, but none of the tables referenced by SQL_statement are in the specified table list, an error is returned.
  • When you specify a table list, but not all of the tables referenced by SQL_statement are in the specified table list, no error is returned, but the nonmatching tables are ignored.
  • INSERT EXPLAIN Actions Without An Existing QCD

    Ensure that there is an existing QCD_name database before running INSERT EXPLAIN for the first time. Use the Control Center feature of the Visual Explain tool to create your QCD databases. See Teradata Visual Explain User Guide for more information.

    If a QCD_name database or any of its tables does not exist, or if you do not have the appropriate privileges on one or more QCD_name database tables when you perform the INSERT EXPLAIN request, then the system displays an appropriate error message and terminates performance of the request.

    Creating Index Analysis Workloads Using the WITH STATISTICS and FOR frequency Options

    When you specify the WITH STATISTICS option with INSERT EXPLAIN, sample table cardinality estimates and column statistics are captured in the TableStatistics tables of your QCD databases. The confidence the Optimizer has in its index recommendations is expressed in the Level column of the QCD table StatsRecs. For details about the StatsRec table, see “Query Capture Facility” in SQL Request and Transaction Processing. This information is critical to the index recommendation analyses performed by the Teradata Index Wizard.

    Note that the sampled statistics captured using the WITH STATISTICS clause are identical to those captured by performing a COLLECT STATISTICS (QCD form) request and a COLLECT request. For more information, see “COLLECT STATISTICS (Optimizer Form)” in SQL Data Definition Language.

    The FOR frequency clause provides an estimate of the average number of times the SQL request being analyzed is typically performed during the submitted workload. This information is used by the Teradata Index Wizard utility to determine the benefit of including various columns as indexes.

    Collecting New Statistics When You Specify INSERT EXPLAIN WITH STATISTICS

    INSERT EXPLAIN WITH STATISTICS collects fresh, sampled statistics only if there are no statistics for the candidate index column set in DBC.TVFields for table_name. Otherwise, the existing statistics, whether based on a sample or not, are copied to the QCD and are not recollected. This enables the index analysis process to make full use of nonsampled statistics if they are available.

    Actions Performed by INSERT EXPLAIN

    INSERT EXPLAIN performs the following actions in the order indicated:

    1 Runs an EXPLAIN on the SQL DML request specified by SQL_statement to generate an Optimizer white tree.

    2 Captures the Optimizer white tree output of that EXPLAIN.

    3 Writes the output to the appropriate tables in the user-specified query capture database.

     

    IF you specify this option …

    THEN INSERT EXPLAIN also …

    USING SAMPLE

    collects statistics on the specified sample size from the population rather than allowing the system to select a sample size based on the cardinality of the table.

    CHECK STATISTICS

    generates a set of recommended statistics to collect and stores the recommendations in the StatsRec QCD table.

    You cannot specify either of these options if you also specify the IN XML option.

    If you are collecting statistics for use by the Teradata Index Wizard utility, then INSERT EXPLAIN performs the following actions in the order indicated:

    1 Runs an EXPLAIN (see “EXPLAIN Request Modifier” on page 572) on the SQL DML request specified by SQL_statement to generate an Optimizer white tree.

    2 Captures the Optimizer white tree output for that EXPLAIN.

    3 Captures the table cardinality and column statistics for the tables referenced by SQL_request, excluding any tables explicitly excluded by the FOR table_name clause. The columns are those identified as potential index candidates.

    4 Captures the number of times SQL_request is performed in the identified workload as specified by the FOR frequency clause.

    5 Writes the output to the appropriate tables in the user-specified query capture database.

    INSERT EXPLAIN Is Not Valid Within a Multistatement Request

    You cannot submit an INSERT EXPLAIN request as part of a multistatement request, though you can submit an INSERT EXPLAIN request for a multistatement request. If you attempt to submit a multistatement request that contains an INSERT EXPLAIN request, the multistatement request aborts and returns an error.

    Note that while the INSERT EXPLAIN request in the following example superficially appears to be a valid statement in a multistatement request, it actually captures the query plan for a multistatement request that follows it, and is not itself part of that multistatement request. As a result, the request is treated like any other INSERT EXPLAIN and completes successfully.

         INSERT EXPLAIN INTO qcd SELECT * FROM d1.t1
         ;SELECT * FROM d1.t1;
     
           *** Insert completed. One row added.
           *** Total elapsed time was 1 second.

    Effects of Data Parcel Peeking on INSERT EXPLAIN Outcomes

    When a data parcel is submitted with an INSERT EXPLAIN request, the plan might be generated with peeking at USING and CURRENT_DATE or DATE values, or both. If any of these values are peeked, then the query plan shows them.

    If no data parcel is submitted with an INSERT EXPLAIN request, the resulting plan is generated without peeking at USING, CURRENT_DATE, or DATE values, so it is a generic plan by definition. Note that the Visual Explain, Teradata System Emulation Tool, and Teradata Index Wizard tool do not accept USING data as input while capturing query plans using INSERT EXPLAIN requests unless those requests are submitted using BTEQ or Teradata SQL Assistant.

    The Teradata Index Wizard internally generates plans for workload queries for the purpose of estimating workload costs, which are used to determine optimal index recommendations. When queries in the workloads specify USING request modifiers, the plan is generated without peeking at USING, CURRENT_DATE, or DATE values. Because of these factors, Request Cache peeking has no impact on the resulting index. Given that workload analyses should be independent of USING values, this behavior is correct.

    Capturing the INSERT EXPLAIN Output as an XML Document

    The IN XML option for INSERT EXPLAIN provides a functionally equivalent representation of a query plan created by an INSERT EXPLAIN request that does not produce XML output.

    INSERT EXPLAIN operations that require several minutes to be processed when you do not specify the IN XML option take only a few seconds to complete when you capture the information as an XML file. This is because an INSERT EXPLAIN … IN XML request performs only one insert operation by storing the entire output of the request as one or more 31,000 byte slices of a single XML file. When INSERT EXPLAIN inserts XML data into the QCD table XMLQCD, it also updates the QCD table SeqNumber, but the update of SeqNumber is a trivial operation.

    When you specify the IN XML option, Teradata Database does not update any other QCD tables beyond XMLQCD and SeqNumber. This is unlike a standard INSERT EXPLAIN request where inserts must be made into multiple QCD tables.

    In a single QCD table, XMLQCD has a marked performance advantage over a standard INSERT EXPLAIN QCD data collection operation. For the definition of the XMLQCD table, see SQL Request and Transaction Processing

    A similar option for the BEGIN QUERY LOGGING and REPLACE QUERY LOGGING statements optionally creates a compact XML version of QCD information that can be stored in a single DBQL table, DBC.DBQLXMLTbl, making the features complementary to one another.

    XMLPLAN query logging is not an alternative method for capturing the information captured by INSERT EXPLAIN requests.

  • In addition to differences in the content of the documents produced, these two methods have an important difference regarding the execution of the query.
  •  

    BEGIN QUERY LOGGING … XMLPLAN

    INSERT EXPLAIN …

    logs query plans for executed queries.

    captures query plans without executing the query.

    XMLPLAN logging is ideal when you want to record query plans for your executing workloads and using INSERT EXPLAIN requests to capture query plans for those workloads is too slow for your needs.

    If you are only tuning a query and do not want to execute it, XMLPLAN logging is not as useful as capturing the query plan for a request using INSERT EXPLAIN requests.

    In this case, executing an INSERT EXPLAIN INTO QCD_name IN XML request or an EXPLAIN IN XML SQL_request (see “Capturing EXPLAIN Text Output in XML Format” on page 574) is a more viable alternative.

  • Runtime information from the traditional DBQL tables is also captured for a logged plan.
  • XMLPLAN logging is more of an extension to query logging than an extension to the Query Capture Facility.
  • Examples  

    The following examples, with some exceptions, use the same SELECT request to show how the different syntax options of INSERT EXPLAIN produce different results.

    Each INSERT EXPLAIN request inserts the data for the query plan produced by the Optimizer for the SELECT request that it modifies directly into the specified QCD database (see “Query Capture Facility” in SQL Data Definition Language for documentation of the QCD tables).

    Example : Query Plan Name Stated, QCD Name Is TLE_Queries

    The output of this INSERT EXPLAIN request is referenced under the query plan name EmployeeSmithQuery in the TLE_Queries database. The system collects neither statistics nor demographics for this request.

         INSERT EXPLAIN 
         INTO TLE_Queries AS EmployeeSmithQuery
         SELECT emp_id, emp_address
         FROM employee
         WHERE emp_name = 'Smith';

    Example : No Query Plan Stated, So Name in TLE_Queries QCD Is Null

    The output of this INSERT EXPLAIN request has a null query plan name in the TLE_Queries database because there is no query plan naming clause. The system collects neither statistics nor demographics for this request.

         INSERT EXPLAIN 
         INTO TLE_Queries
         SELECT emp_id, emp_address
         FROM employee
         WHERE emp_name = 'Smith';

    Example : Query Plan Name Stated, QCD Name Is QCD

    The output of this INSERT EXPLAIN request is referenced under the query plan name Employee Smith Query in the QCD database. The system collects neither statistics nor demographics for this request.

         INSERT EXPLAIN 
         INTO QCD AS "Employee Smith Query"
         SELECT emp_id, emp_address
         FROM employee
         WHERE emp_name = 'Smith';

    Example : Workload Execution Frequency Clause Specified

    The output of this INSERT EXPLAIN request is referenced under the query plan name Wizard Test in the Wizard_QCD database. Statistics are collected for table_1 and column statistics are collected for column_2 and column_3 in table_1. The frequency of performance of the specified SQL SELECT request is 10 times per workload. No demographics are collected.

         INSERT EXPLAIN WITH STATISTICS FOR table_1 
         INTO Wizard_QCD AS "Wizard Test" FOR 10
         SELECT *
         FROM table_1
         WHERE table_1.column_2 = 10
         AND table_1.column_3 = 20
         FOR 10;

    Example  

    The output of this INSERT EXPLAIN request is saved under the query plan name WizardTest2 in the Wizard_QCD database. Statistics are collected for the employee table and column statistics are collected for the emp_lastname column. No statistics are captured for the department table because the request specifies they are to be collected only for the employee table in the FOR table_name clause.

         INSERT EXPLAIN WITH STATISTICS FOR employee 
         INTO Wizard_QCD AS WizardTest2
         SELECT employee.first_name, department.dept_num
         FROM employee, department
         WHERE employee.emp_lastname = 'Smith' 
         AND   department.dept_name IN ('Engineering', 'Support');

    Example : Request Failure Because There Is No Reference To The Table Named In The Table List In The Specified SQL Query

    This INSERT EXPLAIN request returns an error because none of the tables specified in the table list are referenced in the specified query.

         INSERT EXPLAIN WITH STATISTICS FOR employee1 
         INTO Wizard_QCD AS WizardTest2
         SELECT employee.first_name, department.dept_num
         FROM employee, department
         WHERE employee.emp_lastname = 'Smith' 
         AND department.dept_name IN ('Engineering', 'Support');
     
        ****Failure 5644: No match found for specified tables in the request.

    Example : Specifying Statistical Sampling

    This example uses an explicit sampling percentage of 80 to capture QCD statistics for the specified SELECT request:

         INSERT EXPLAIN WITH STATISTICS
         USING SAMPLE 80 PERCENT 
         INTO MyQCD AS query1
         SELECT t2.y3, t1.x3  
         FROM t1, t2  
         WHERE  t1.pi = t2.y2;

    Example : Generating Single‑Column Statistics

    Single‑column statistics recommendations are represented by a single row in the StatsRecs table. In this example there are two separate recommendations, one for the primary index on column a1 and one for the nonindexed column c1.

    After you run this request, you can then apply the generated recommendations for collecting statistics by running the two COLLECT STATISTICS requests that the system stores in the StatsDDL column of the StatsRec table in the QCD database.

    The definition for the only table referenced by the SQL request to be analyzed is as follows:

         CREATE TABLE t1 (
           a1 INTEGER, 
           b1 INTEGER, 
           c1 INTEGER);

    The following INSERT EXPLAIN request generates the single‑column statistics recommendations for the SQL request SELECT * FROM t1 WHERE c1=5.

         INSERT EXPLAIN INTO QCD 
         CHECK STATISTICS 
         SELECT * FROM t1 WHERE c1 = 5;

    The following SELECT request selects all the columns that contain the statistics recommendations for QueryID 1, which is the system‑assigned unique identifier for the row set in the QCD StatsRecs table that contains the recommendations:

         SELECT * 
         FROM QCD.StatsRecs 
         WHERE QueryID = 1 
         ORDER BY StatsID;

    The query returns the following report:

         QueryID           1
         StatsID           0
    DatabaseName D1
       TableName t1
         FieldID        1025
       FieldName A1
          Weight           3
        StatsDDL COLLECT STATISTICS D1.t1 COLUMN A1 ;
         QueryID           1
         StatsID           1
    DatabaseName D1
       TableName t1
         FieldID        1027
       FieldName C1
          Weight           3
        StatsDDL COLLECT STATISTICS D1.t1 COLUMN C1 ;

    Example : Generating Multicolumn Statistics

    Multicolumn statistics recommendations are represented by multiple rows in the StatsRecs table. Each column of the recommendation is stored in a separate row, sharing its StatsID value with the other columns in the set, each of which is represented by its own row.

    Note how the COLLECT STATISTICS request text is stored in the row corresponding to the lowest FieldID value. The StatsDDL value for the other rows is null. In this example, there is one recommendation for the multicolumn primary index.

    After you run this request, you can then apply the generated recommendations for collecting statistics by running the COLLECT STATISTICS request that the system stores in the StatsDDL column of the StatsRec table in the QCD database.

    The definition for the only table referenced by the SQL request to be analyzed is as follows:

         CREATE TABLE t2 (
           a2 INTEGER, 
           b2 INTEGER, 
           c2 INTEGER, 
           d2 INTEGER) 
         PRIMARY INDEX (a2, b2, c2);

    The following INSERT EXPLAIN request generates the multicolumn statistics recommendations for the SQL request SELECT * FROM t2.

         INSERT EXPLAIN INTO QCD 
         CHECK STATISTICS 
         SELECT * FROM t2;

    The following SELECT request selects all the columns that contain statistics recommendations for QueryID 2, which is the system‑assigned unique identifier for the row set in the QCD StatsRecs table that contains the recommendations:

         SELECT * 
         FROM QCD.StatsRecs 
         WHERE QueryID = 2 
         ORDER BY StatsID, FieldID;

    The query returns the following report:

         QueryID           2
         StatsID           0
    DatabaseName D1
       TableName t2
         FieldID        1025
       FieldName A2
          Weight           3
        StatsDDL COLLECT STATISTICS D1.t2 COLUMN (A2 ,B2 ,C2 ) ;
         QueryID           2
         StatsID           0
    DatabaseName D1
       TableName t2
         FieldID        1026
       FieldName B2
          Weight           3
        StatsDDL ?
         QueryID           2
         StatsID           0
    DatabaseName D1
       TableName t2
         FieldID        1027
       FieldName C2
          Weight           3
        StatsDDL ?

    There are three rows in the report: one for each of the columns that make up the primary index for the table. You can tell these are multicolumn statistics because all three rows share the same QueryId value (2) and the same StatsID value (0). Another sign that these are multicolumn statistics is that there is only one column of DDL text, and it is stored in the row with the lowest FieldID value (1025).

    Example : Capturing Output Data as an XML Document

    This example produces one row of query capture output and stores it in XML format in the QCD table myqcd.XMLQCD.

         INSERT EXPLAIN INTO myqcd IN XML 
         SELECT * 
         FROM DBC.DBCInfoV;

    The following output shows a fragment of the row inserted into myqcd.XMLQCD table.

                 Id            1
               Kind            Q
                Seq            1
             Length         4501
               Text  <?xml version="1.0" encoding="UTF-8"?><QCF>
    <ElapsedSeconds>0.030<ElapsedSeconds><QueryID>8<QueryID>
    <User_Database><UDB_Key>1</UDB_Key><UDB_ID>256</UDB_ID>
    <MachineName>localhost</MachineName><UDB_Name>DBC</UDB_Name>
    </User_Database><Query><UDB_Key>1</UDB_Key><MachName>localhost
    </MachName><NumAMPs>2</NumAMPs><NumPEs>1<NumPEs><NumNodes>1
    <NumNodes><ReleaseInfo>13w.00.00.00</ReleaseInfo><VersionInfo>
    13w.00.00.00</VersionInfo>…

    Example : Capturing Output Data as an XML Document With No Associated DDL Text

    This example produces one row of query capture output and stores it in XML format in the QCD table myqcd.XMLQCD. Because you specified the NODDLTEXT option, the system does not store the DDL SQL text related to the request.

         INSERT EXPLAIN INTO myqcd IN XML NODDLTEXT
         SELECT * 
         FROM DBC.DBCInfoV;

    For More Information

    For more information about index analysis, see:

  • “BEGIN QUERY LOGGING” and “REPLACE QUERY LOGGING” in SQL Data Definition Language.
  • “COLLECT DEMOGRAPHICS” on page 554
  • “COLLECT STATISTICS (QCD Form)” on page 557
  • “DUMP EXPLAIN” on page 567
  • “EXPLAIN Request Modifier” on page 572
  • “INITIATE INDEX ANALYSIS” on page 579
  • “RESTART INDEX ANALYSIS” on page 609
  • SQL Request and Transaction Processing
  • Database Design
  • Teradata Index Wizard User Guide
  • Teradata Viewpoint User Guide, chapter on Stats Manager
  • Teradata Visual Explain User Guide
  • Teradata System Emulation Tool User Guide