INSERT EXPLAIN
Purpose
Captures the Optimizer white tree information for a query and writes it to a user-defined query capture database, optionally as an XML document.
Syntax
where:
Syntax element … |
Specifies … |
WITH STATISTICS |
to capture the statistics for the WHERE clause condition columns specified in SQL_statement. You cannot specify WITH STATISTICS if you also specify the IN XML option. You cannot specify the WITH STATISTICS option if you submit an INSERT EXPLAIN request in Index Validation mode (see “DIAGNOSTIC “Validate Index”” on page 642). Specifying this option is equivalent to performing COLLECT STATISTICS (QCD Form) on the query table columns. The columns inserted into the QCD are those determined to be index candidates by the Teradata Index Wizard. |
USING SAMPLE percentage USING SAMPLE percentage PERCENT |
the percentage of data to be read for collecting statistics. If you do not specify this clause, the system automatically selects a sample size based on the cardinality of the table. You cannot specify a USING SAMPLE clause if you also specify an IN XML clause. The valid range is between 1 and 100, inclusive. |
WITH NO STATISTICS |
not to capture statistics for the WHERE clause condition columns specified in SQL_statement. |
AND DEMOGRAPHICS |
to collect table‑level demographics as well as statistics. Specifying this option is equivalent to performing a COLLECT DEMOGRAPHICS request on the query tables. See “COLLECT DEMOGRAPHICS” on page 554. If you do not specify this option, then the system does not collect table‑level demographics. |
FOR table_name FOR database_name. FOR user_name.table_name |
the set of tables for which data demographics and statistics details are either to be included or excluded for analysis by the Teradata Index Wizard, depending on whether you specify the WITH STATISTICS option or the WITH NO STATISTICS options, respectively. Statistics are collected on columns that are index candidates and for all columns and indexes referenced explicitly with values in SQL_statement for the following cases. A candidate index column is defined as a column for which the value or join range access frequencies stored in the Field table of the specified QCD are greater than 0. If you specify: You cannot specify a FOR table_name clause if you also specify an IN XML clause. |
INTO QCD_name |
a user-defined query capture database. A database named QCD_name must exist on the system. Otherwise, the request aborts and returns an error. Using the Control Center feature of the Visual Explain tool is the easiest way to create your QCD databases. |
AS query_plan_name |
an optional user-defined name under which the query plan information is to be stored. For information on object naming, see SQL Fundamentals. If no query_plan_name is specified, the query plan information is stored with a null name. Because each query plan is stored with a unique non-null Query ID (see SQL Request and Transaction Processing), various query plans within a given database can be distinguished. Note that Query IDs are not unique across databases. query_plan_name is not constrained to be unique. You can store query_plan_name as query plan name if you enclose the pad character-separated words in APOSTROPHE characters as “query plan name”. |
LIMIT LIMIT SQL LIMIT SQL =n |
to place a limit on the size of the query, DDL, and view text captured for the QCD tables Query, Relation, and ViewText, respectively. If you do not specify this clause, then the system captures complete SQL text. The value for n indicates the upper limit on the amount of text to capture. If you specify either LIMIT by itself or LIMIT SQL without a value, then the clause is equivalent to LIMIT SQL = 0, and no text is captured. |
FOR frequency |
the number of times SQL_statement is typically performed within its identified workload. This value is used to weight the respective benefits of each column analyzed for inclusion in the index recommendation computed by the Teradata Index Wizard utility. Any positive integer up to 4 bytes is valid. You cannot specify a FOR frequency clause if you also specify an IN XML clause. If this clause is not specified, then the value for frequency defaults to 1. |
CHECK STATISTICS |
to capture COLLECT STATISTICS recommendations for SQL_statement in the StatsRecs QCD table. See SQL Request and Transaction Processing. You cannot specify CHECK STATISTICS if you also specify the IN XML option. |
IN XML |
to capture the output as an XML document. This document is stored in the QCD table named XMLQCD. You cannot specify an IN XML clause if you also specify any of the following options: If you specify the IN XML option, you can also specify the NODDLTEXT option with it. If you specify the IN XML option, you cannot specify any of the following options with it: |
NODDLTEXT |
not to capture the DDL text in the XML document stored in qcd.XMLQCD. |
SQL_request |
the DML request whose Optimizer white tree information is to be captured and written to the QCD. SQL_request is limited to the following statements. |
ANSI Compliance
INSERT EXPLAIN is a Teradata extension to the ANSI SQL:2011 standard.
Required Privileges
You must have INSERT privileges on the tables in the user-defined query capture database.
You must also have the SELECT privilege on all referenced tables to capture data demographics and column statistics.
Invocation
Normally invoked using client‑based database query analysis tools.
Rules for Using INSERT EXPLAIN
The following rules apply to the use of INSERT EXPLAIN:
INSERT EXPLAIN Actions Without An Existing QCD
Ensure that there is an existing QCD_name database before running INSERT EXPLAIN for the first time. Use the Control Center feature of the Visual Explain tool to create your QCD databases. See Teradata Visual Explain User Guide for more information.
If a QCD_name database or any of its tables does not exist, or if you do not have the appropriate privileges on one or more QCD_name database tables when you perform the INSERT EXPLAIN request, then the system displays an appropriate error message and terminates performance of the request.
Creating Index Analysis Workloads Using the WITH STATISTICS and FOR frequency Options
When you specify the WITH STATISTICS option with INSERT EXPLAIN, sample table cardinality estimates and column statistics are captured in the TableStatistics tables of your QCD databases. The confidence the Optimizer has in its index recommendations is expressed in the Level column of the QCD table StatsRecs. For details about the StatsRec table, see “Query Capture Facility” in SQL Request and Transaction Processing. This information is critical to the index recommendation analyses performed by the Teradata Index Wizard.
Note that the sampled statistics captured using the WITH STATISTICS clause are identical to those captured by performing a COLLECT STATISTICS (QCD form) request and a COLLECT request. For more information, see “COLLECT STATISTICS (Optimizer Form)” in SQL Data Definition Language.
The FOR frequency clause provides an estimate of the average number of times the SQL request being analyzed is typically performed during the submitted workload. This information is used by the Teradata Index Wizard utility to determine the benefit of including various columns as indexes.
Collecting New Statistics When You Specify INSERT EXPLAIN WITH STATISTICS
INSERT EXPLAIN WITH STATISTICS collects fresh, sampled statistics only if there are no statistics for the candidate index column set in DBC.TVFields for table_name. Otherwise, the existing statistics, whether based on a sample or not, are copied to the QCD and are not recollected. This enables the index analysis process to make full use of nonsampled statistics if they are available.
Actions Performed by INSERT EXPLAIN
INSERT EXPLAIN performs the following actions in the order indicated:
1 Runs an EXPLAIN on the SQL DML request specified by SQL_statement to generate an Optimizer white tree.
2 Captures the Optimizer white tree output of that EXPLAIN.
3 Writes the output to the appropriate tables in the user-specified query capture database.
IF you specify this option … |
THEN INSERT EXPLAIN also … |
USING SAMPLE |
collects statistics on the specified sample size from the population rather than allowing the system to select a sample size based on the cardinality of the table. |
CHECK STATISTICS |
generates a set of recommended statistics to collect and stores the recommendations in the StatsRec QCD table. |
You cannot specify either of these options if you also specify the IN XML option.
If you are collecting statistics for use by the Teradata Index Wizard utility, then INSERT EXPLAIN performs the following actions in the order indicated:
1 Runs an EXPLAIN (see “EXPLAIN Request Modifier” on page 572) on the SQL DML request specified by SQL_statement to generate an Optimizer white tree.
2 Captures the Optimizer white tree output for that EXPLAIN.
3 Captures the table cardinality and column statistics for the tables referenced by SQL_request, excluding any tables explicitly excluded by the FOR table_name clause. The columns are those identified as potential index candidates.
4 Captures the number of times SQL_request is performed in the identified workload as specified by the FOR frequency clause.
5 Writes the output to the appropriate tables in the user-specified query capture database.
INSERT EXPLAIN Is Not Valid Within a Multistatement Request
You cannot submit an INSERT EXPLAIN request as part of a multistatement request, though you can submit an INSERT EXPLAIN request for a multistatement request. If you attempt to submit a multistatement request that contains an INSERT EXPLAIN request, the multistatement request aborts and returns an error.
Note that while the INSERT EXPLAIN request in the following example superficially appears to be a valid statement in a multistatement request, it actually captures the query plan for a multistatement request that follows it, and is not itself part of that multistatement request. As a result, the request is treated like any other INSERT EXPLAIN and completes successfully.
INSERT EXPLAIN INTO qcd SELECT * FROM d1.t1
;SELECT * FROM d1.t1;
*** Insert completed. One row added.
*** Total elapsed time was 1 second.
Effects of Data Parcel Peeking on INSERT EXPLAIN Outcomes
When a data parcel is submitted with an INSERT EXPLAIN request, the plan might be generated with peeking at USING and CURRENT_DATE or DATE values, or both. If any of these values are peeked, then the query plan shows them.
If no data parcel is submitted with an INSERT EXPLAIN request, the resulting plan is generated without peeking at USING, CURRENT_DATE, or DATE values, so it is a generic plan by definition. Note that the Visual Explain, Teradata System Emulation Tool, and Teradata Index Wizard tool do not accept USING data as input while capturing query plans using INSERT EXPLAIN requests unless those requests are submitted using BTEQ or Teradata SQL Assistant.
The Teradata Index Wizard internally generates plans for workload queries for the purpose of estimating workload costs, which are used to determine optimal index recommendations. When queries in the workloads specify USING request modifiers, the plan is generated without peeking at USING, CURRENT_DATE, or DATE values. Because of these factors, Request Cache peeking has no impact on the resulting index. Given that workload analyses should be independent of USING values, this behavior is correct.
Capturing the INSERT EXPLAIN Output as an XML Document
The IN XML option for INSERT EXPLAIN provides a functionally equivalent representation of a query plan created by an INSERT EXPLAIN request that does not produce XML output.
INSERT EXPLAIN operations that require several minutes to be processed when you do not specify the IN XML option take only a few seconds to complete when you capture the information as an XML file. This is because an INSERT EXPLAIN … IN XML request performs only one insert operation by storing the entire output of the request as one or more 31,000 byte slices of a single XML file. When INSERT EXPLAIN inserts XML data into the QCD table XMLQCD, it also updates the QCD table SeqNumber, but the update of SeqNumber is a trivial operation.
When you specify the IN XML option, Teradata Database does not update any other QCD tables beyond XMLQCD and SeqNumber. This is unlike a standard INSERT EXPLAIN request where inserts must be made into multiple QCD tables.
In a single QCD table, XMLQCD has a marked performance advantage over a standard INSERT EXPLAIN QCD data collection operation. For the definition of the XMLQCD table, see SQL Request and Transaction Processing
A similar option for the BEGIN QUERY LOGGING and REPLACE QUERY LOGGING statements optionally creates a compact XML version of QCD information that can be stored in a single DBQL table, DBC.DBQLXMLTbl, making the features complementary to one another.
XMLPLAN query logging is not an alternative method for capturing the information captured by INSERT EXPLAIN requests.
BEGIN QUERY LOGGING … XMLPLAN |
INSERT EXPLAIN … |
logs query plans for executed queries. |
captures query plans without executing the query. |
XMLPLAN logging is ideal when you want to record query plans for your executing workloads and using INSERT EXPLAIN requests to capture query plans for those workloads is too slow for your needs.
If you are only tuning a query and do not want to execute it, XMLPLAN logging is not as useful as capturing the query plan for a request using INSERT EXPLAIN requests.
In this case, executing an INSERT EXPLAIN INTO QCD_name IN XML request or an EXPLAIN IN XML SQL_request (see “Capturing EXPLAIN Text Output in XML Format” on page 574) is a more viable alternative.
Examples
The following examples, with some exceptions, use the same SELECT request to show how the different syntax options of INSERT EXPLAIN produce different results.
Each INSERT EXPLAIN request inserts the data for the query plan produced by the Optimizer for the SELECT request that it modifies directly into the specified QCD database (see “Query Capture Facility” in SQL Data Definition Language for documentation of the QCD tables).
Example : Query Plan Name Stated, QCD Name Is TLE_Queries
The output of this INSERT EXPLAIN request is referenced under the query plan name EmployeeSmithQuery in the TLE_Queries database. The system collects neither statistics nor demographics for this request.
INSERT EXPLAIN
INTO TLE_Queries AS EmployeeSmithQuery
SELECT emp_id, emp_address
FROM employee
WHERE emp_name = 'Smith';
Example : No Query Plan Stated, So Name in TLE_Queries QCD Is Null
The output of this INSERT EXPLAIN request has a null query plan name in the TLE_Queries database because there is no query plan naming clause. The system collects neither statistics nor demographics for this request.
INSERT EXPLAIN
INTO TLE_Queries
SELECT emp_id, emp_address
FROM employee
WHERE emp_name = 'Smith';
Example : Query Plan Name Stated, QCD Name Is QCD
The output of this INSERT EXPLAIN request is referenced under the query plan name Employee Smith Query in the QCD database. The system collects neither statistics nor demographics for this request.
INSERT EXPLAIN
INTO QCD AS "Employee Smith Query"
SELECT emp_id, emp_address
FROM employee
WHERE emp_name = 'Smith';
Example : Workload Execution Frequency Clause Specified
The output of this INSERT EXPLAIN request is referenced under the query plan name Wizard Test in the Wizard_QCD database. Statistics are collected for table_1 and column statistics are collected for column_2 and column_3 in table_1. The frequency of performance of the specified SQL SELECT request is 10 times per workload. No demographics are collected.
INSERT EXPLAIN WITH STATISTICS FOR table_1
INTO Wizard_QCD AS "Wizard Test" FOR 10
SELECT *
FROM table_1
WHERE table_1.column_2 = 10
AND table_1.column_3 = 20
FOR 10;
Example
The output of this INSERT EXPLAIN request is saved under the query plan name WizardTest2 in the Wizard_QCD database. Statistics are collected for the employee table and column statistics are collected for the emp_lastname column. No statistics are captured for the department table because the request specifies they are to be collected only for the employee table in the FOR table_name clause.
INSERT EXPLAIN WITH STATISTICS FOR employee
INTO Wizard_QCD AS WizardTest2
SELECT employee.first_name, department.dept_num
FROM employee, department
WHERE employee.emp_lastname = 'Smith'
AND department.dept_name IN ('Engineering', 'Support');
Example : Request Failure Because There Is No Reference To The Table Named In The Table List In The Specified SQL Query
This INSERT EXPLAIN request returns an error because none of the tables specified in the table list are referenced in the specified query.
INSERT EXPLAIN WITH STATISTICS FOR employee1
INTO Wizard_QCD AS WizardTest2
SELECT employee.first_name, department.dept_num
FROM employee, department
WHERE employee.emp_lastname = 'Smith'
AND department.dept_name IN ('Engineering', 'Support');
****Failure 5644: No match found for specified tables in the request.
Example : Specifying Statistical Sampling
This example uses an explicit sampling percentage of 80 to capture QCD statistics for the specified SELECT request:
INSERT EXPLAIN WITH STATISTICS
USING SAMPLE 80 PERCENT
INTO MyQCD AS query1
SELECT t2.y3, t1.x3
FROM t1, t2
WHERE t1.pi = t2.y2;
Example : Generating Single‑Column Statistics
Single‑column statistics recommendations are represented by a single row in the StatsRecs table. In this example there are two separate recommendations, one for the primary index on column a1 and one for the nonindexed column c1.
After you run this request, you can then apply the generated recommendations for collecting statistics by running the two COLLECT STATISTICS requests that the system stores in the StatsDDL column of the StatsRec table in the QCD database.
The definition for the only table referenced by the SQL request to be analyzed is as follows:
CREATE TABLE t1 (
a1 INTEGER,
b1 INTEGER,
c1 INTEGER);
The following INSERT EXPLAIN request generates the single‑column statistics recommendations
for the SQL request SELECT * FROM t1 WHERE c1=5
.
INSERT EXPLAIN INTO QCD
CHECK STATISTICS
SELECT * FROM t1 WHERE c1 = 5;
The following SELECT request selects all the columns that contain the statistics recommendations for QueryID 1, which is the system‑assigned unique identifier for the row set in the QCD StatsRecs table that contains the recommendations:
SELECT *
FROM QCD.StatsRecs
WHERE QueryID = 1
ORDER BY StatsID;
The query returns the following report:
QueryID 1
StatsID 0
DatabaseName D1
TableName t1
FieldID 1025
FieldName A1
Weight 3
StatsDDL COLLECT STATISTICS D1.t1 COLUMN A1 ;
QueryID 1
StatsID 1
DatabaseName D1
TableName t1
FieldID 1027
FieldName C1
Weight 3
StatsDDL COLLECT STATISTICS D1.t1 COLUMN C1 ;
Example : Generating Multicolumn Statistics
Multicolumn statistics recommendations are represented by multiple rows in the StatsRecs table. Each column of the recommendation is stored in a separate row, sharing its StatsID value with the other columns in the set, each of which is represented by its own row.
Note how the COLLECT STATISTICS request text is stored in the row corresponding to the lowest FieldID value. The StatsDDL value for the other rows is null. In this example, there is one recommendation for the multicolumn primary index.
After you run this request, you can then apply the generated recommendations for collecting statistics by running the COLLECT STATISTICS request that the system stores in the StatsDDL column of the StatsRec table in the QCD database.
The definition for the only table referenced by the SQL request to be analyzed is as follows:
CREATE TABLE t2 (
a2 INTEGER,
b2 INTEGER,
c2 INTEGER,
d2 INTEGER)
PRIMARY INDEX (a2, b2, c2);
The following INSERT EXPLAIN request generates the multicolumn statistics recommendations
for the SQL request SELECT * FROM t2
.
INSERT EXPLAIN INTO QCD
CHECK STATISTICS
SELECT * FROM t2;
The following SELECT request selects all the columns that contain statistics recommendations for QueryID 2, which is the system‑assigned unique identifier for the row set in the QCD StatsRecs table that contains the recommendations:
SELECT *
FROM QCD.StatsRecs
WHERE QueryID = 2
ORDER BY StatsID, FieldID;
The query returns the following report:
QueryID 2
StatsID 0
DatabaseName D1
TableName t2
FieldID 1025
FieldName A2
Weight 3
StatsDDL COLLECT STATISTICS D1.t2 COLUMN (A2 ,B2 ,C2 ) ;
QueryID 2
StatsID 0
DatabaseName D1
TableName t2
FieldID 1026
FieldName B2
Weight 3
StatsDDL ?
QueryID 2
StatsID 0
DatabaseName D1
TableName t2
FieldID 1027
FieldName C2
Weight 3
StatsDDL ?
There are three rows in the report: one for each of the columns that make up the primary index for the table. You can tell these are multicolumn statistics because all three rows share the same QueryId value (2) and the same StatsID value (0). Another sign that these are multicolumn statistics is that there is only one column of DDL text, and it is stored in the row with the lowest FieldID value (1025).
Example : Capturing Output Data as an XML Document
This example produces one row of query capture output and stores it in XML format in the QCD table myqcd.XMLQCD.
INSERT EXPLAIN INTO myqcd IN XML
SELECT *
FROM DBC.DBCInfoV;
The following output shows a fragment of the row inserted into myqcd.XMLQCD table.
Id 1
Kind Q
Seq 1
Length 4501
Text <?xml version="1.0" encoding="UTF-8"?><QCF>
<ElapsedSeconds>0.030<ElapsedSeconds><QueryID>8<QueryID>
<User_Database><UDB_Key>1</UDB_Key><UDB_ID>256</UDB_ID>
<MachineName>localhost</MachineName><UDB_Name>DBC</UDB_Name>
</User_Database><Query><UDB_Key>1</UDB_Key><MachName>localhost
</MachName><NumAMPs>2</NumAMPs><NumPEs>1<NumPEs><NumNodes>1
<NumNodes><ReleaseInfo>13w.00.00.00</ReleaseInfo><VersionInfo>
13w.00.00.00</VersionInfo>…
Example : Capturing Output Data as an XML Document With No Associated DDL Text
This example produces one row of query capture output and stores it in XML format in the QCD table myqcd.XMLQCD. Because you specified the NODDLTEXT option, the system does not store the DDL SQL text related to the request.
INSERT EXPLAIN INTO myqcd IN XML NODDLTEXT
SELECT *
FROM DBC.DBCInfoV;
For More Information
For more information about index analysis, see: