Usage Notes - Teradata Database

SQL Data Manipulation Language

Product
Teradata Database
Release Number
16.10
Published
June 2017
Language
English (United States)
Last Update
2018-04-25
dita:mapPath
psg1480972718197.ditamap
dita:ditavalPath
changebar_rev_16_10_exclude_audience_ie.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata® Database

EXPLAIN and the USING Request Modifier

If you specify both an EXPLAIN request modifier and a USING request modifier for the same request, the EXPLAIN modifier must precede the USING modifier. See USING Request Modifier. A dynamic plan cannot be displayed for requests with a USING request modifier.

Capturing EXPLAIN Text Output in XML Format

The IN XML option returns the EXPLAIN text as XML text rather than as plain English text. See SQL Request and Transaction Processing, B035-1142 for the XML plan attributes and the XML schema used by the XML output produced by the IN XML option.

The EXPLAIN IN XML request modifier can also specify that the output be returned in one of two possible formats:
  • As XML text with any associated DDL text. You specify this with the IN XML option.
  • As XML text, but without any associated DDL text. You specify this with the IN XML NODDLTEXT option.

A complementary option for the BEGIN QUERY LOGGING and REPLACE QUERY LOGGING statements optionally creates a compact XML version of the information that can be stored in a DBQL table. See “BEGIN QUERY LOGGING” and “REPLACE QUERY LOGGING” in SQL Data Definition Language - Syntax and Examples, B035-1144.

EXPLAIN Report Overview

EXPLAIN provides a summary of the access and join plans generated by the Optimizer for the query SQL requests. The report details which indexes would be used to process the request, identifies any intermediate spools that would be generated, indicates the types of join to be performed, shows whether the requests in a transaction would be dispatched in parallel, and so on.

When you perform an EXPLAIN against any SQL request, that request is parsed and optimized, and the plan generated by the Optimizer is returned to the requestor in the form of a text file that explains the steps of the plan for the request. The EXPLAIN report reproduces the execution strategy determined by the Optimizer, but does not explain why it makes the choices it does.

EXPLAIN helps you to evaluate complex queries and to develop alternative, more efficient, processing strategies.

Only the first 255 characters of a conditional expression are displayed in an EXPLAIN. The entire conditional expression is enclosed in APOSTROPHE characters.

Although the times reported in the EXPLAIN output are presented in units of seconds, they are actually arbitrary units of time. These numbers are valuable because they permit you to compare alternate coding formulations of the same query with respect to relative performance, but do not correlate with clock time. Times and row estimates are not reported for some steps.

Save the EXPLAIN results for future reference.

Substitute Characters in Object Names and Literals Returned in an EXPLAIN Report

If an object name or literal returned in an EXPLAIN report includes a character not in the the session character set, or is otherwise unprintable, the name or literal is shown as a UNICODE delimited identifier or literal.

Each non-representable character is escaped to its hexadecimal equivalent.

The system uses BACKSLASH (U+005C), if available, as the escape character. If BACKSLASH is not available:
  • If either YEN SIGN (U+00A5) or WON SIGN (U+20A9) replaces BACKSLASH in the session character set, the system uses the replacement as the escape character.
  • If none of the preceding characters is available, the system uses NUMBER SIGN (U+0023).

Object names that are translatable to the session character set, but are not lexically distinguishable as names are enclosed in double quotation marks.

For information on object naming, see SQL Fundamentals, B035-1141.

Standard Form of Display for EXPLAIN

The standard output of an EXPLAIN request modifier displays character constant strings as follows:

Type of Constant Format
Printable single byte character. Teradata Latin.
Anything else. Internal Teradata hexadecimal.

EXPLAIN Processes SQL Requests Only

You can modify any valid Teradata SQL request with EXPLAIN.

You cannot EXPLAIN a USING request modifier, a WITH RECURSIVE view modifier, or another EXPLAIN modifier. See USING Request Modifier and WITH Statement Modifier. You cannot explain individual functions, procedures, or methods.

Teradata Visual Explain Utility

The Teradata Visual Explain tool provides a graphic display of Optimizer plans and also permits you to compare the plans for queries that return the same result. This feature can simplify the interpretation of EXPLAIN reports.

For more information about Teradata Visual Explain, see Teradata Visual Explain User Guide . Related supporting information appears in SQL Request and Transaction Processing, B035-1142.

Dynamic Feedback

A dynamic plan consists of multiple plan fragments. A fragment is considered an intermediate fragment if the fragment is not the last fragment. In an Explain, an intermediate fragment has at least one FEEDBACK RETRIEVE step and completes with an END PLAN FRAGMENT step. A FEEDBACK RETRIEVE step collects the run-time information of a target spool and feeds the information back into the query optimizer. The optimizer uses the feedback information in planning another fragment where the spool is used.

Statistics feedback includes either:
  • Table-level statistics, that is, the number of rows in a target spool and the physical size of the target spool.
  • Table-level statistics and column-level statistics, for example, the number of unique values, the high-mode-frequency, and so on.

Results feedback includes the contents of a target spool and table-level statistics. The table-level statistics are always collected and displayed in Dynamic Explain, regardless of whether the feedback is statistics or results.

Below is an example of statistics feedback where both the table-level and the column-level statistics are collected on Spool 6. The following sentence in Step 8 indicates the column-level statistics collection: Spool 6 sample (50%) statistics go into Spool 19. The column-level statistics are collected respectively by all AMPs, while the rows of Spool 6 are generated. The collected statistics are stored in Spool 19 and read by the step 9 FEEDBACK RETRIEVE. The FEEDBACK RETRIEVE step consolidates the column-level statistics from all AMPs and collects the table-level statistics, as indicated by the statistics keyword. The actual number of rows (45,993 rows) in Spool 6 and the actual physical size (1,655,748 bytes) of Spool 6 are shown in the FEEDBACK RETRIEVE step. The actual numbers in step 9 differ from what the optimizer estimated in step 8 (47,428 rows and 1,185,700 bytes). When a sampling technique is used for the column-level statistics, the maximum sample percentage is described in step 8 and the actual sample percentage is described in the FEEDBACK RETRIEVE step. The maximum sample percentage is fixed for all query requests, but the actual sample percentage is determined dynamically for each spool, based on the characteristics of the underlying data, while column-level statistics are collected. The actual sample percentage is rounded up to the nearest integer.

  8) We do an all-AMPs JOIN step (Global sum) from
     TPCDS.customer_address by way of an all-rows scan with a condition
     of ("NOT (SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 )IS NULL)"),
     which is joined to Spool 7 (Last Use) by way of an all-rows scan.
     TPCDS.customer_address is used as the hash table and Spool 7
     is used as the probe table in a joined using a single partition
     classical hash join, with a join condition of (
     "TPCDS.customer_address.ca_address_sk = Spool_7.c_current_addr_sk").
     The result goes into Spool 6 (all_amps), which is built locally on
     the AMPs with Field1 ("-26834").  Spool 6 sample (50%) statistics
     go into Spool 19.  The size of Spool 6 is estimated with low
     confidence to be 47,428 rows (1,185,700 bytes).
  9) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 19 (Last Use)
     of Spool 6 statistics.  The size is estimated with high confidence
     to be 1 row.  The estimated time for this step is 0.03 seconds.
     The actual size of Spool 6 is 45,993 rows (1,655,748 bytes).
     The statistics are estimated from a sample of 15,849 rows (35% of
     the total rows).
  10) We send an END PLAN FRAGMENT step for plan fragment 1. 

Below is an example of results feedback, where the contents of Spool 2 and table-level statistics are collected. The FEEDBACK RETRIEVE step does not include the statistics keyword, indicating results feedback. The actual data of Spool 2 is used to rewrite the query. Step 23 shows that the feedback data is used as a join condition. For security purposes, the actual value is masked with the special symbol (":*").

  17) We do an all-AMPs FEEDBACK RETRIEVE step from Spool 2.  The size
     is estimated with no confidence to be 2 rows.  The estimated time
     for this step is 0.03 seconds.  The actual size of Spool 2 is 9
     rows (2,048 bytes).
  18) We send an END PLAN FRAGMENT step for plan fragment 1, and we
     release Spool 2 (Last Use).
...
  23) We do an all-AMPs JOIN step from TPCDS.customer_address by way of
     an all-rows scan with no residual conditions, which is joined to
     Spool 8 (Last Use) by way of an all-rows scan.
     TPCDS.customer_address and Spool 8 are joined using a single
     partition hash join, with a join condition of ("(((i_item_id =
     :*) OR ((i_item_id = :*) OR ((i_item_id = :*) OR ((i_item_id =
     :*) OR ((i_item_id = :*) OR ((i_item_id = :*) OR ((i_item_id =
     :*) OR ((i_item_id = :*) OR (i_item_id = :*))))))))) OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '80348') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '81792') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '83405') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '85392') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '85460') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '85669') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '86197') OR
     (((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '86475') OR
     ((SUBSTR(TPCDS.customer_address.ca_zip ,1 ,5 ))= '88274'))))))))))
     AND (c_current_addr_sk = TPCDS.customer_address.ca_address_sk)").
     The result goes into Spool 3 (all_amps), which is built locally on
     the AMPs.  Spool 3 sample (50%) statistics go into Spool 17.  The
     size of Spool 3 is estimated with low confidence to be 35,873 rows
     (1,829,523 bytes).

EXPLAIN and Embedded SQL

Use EXPLAIN only with data returning requests. EXPLAIN returns multiple rows of a single column whose data type is VARCHAR(80). Because it causes multiple rows to be returned, EXPLAIN can only be performed with a cursor.

For static execution of EXPLAIN, see SQL Stored Procedures and Embedded SQL , B035-1148 , substituting an EXPLAIN-modified SQL request for the query_expression clause shown there.

Do not perform EXPLAIN request modifiers dynamically because the result is unpredictable.

EXPLAIN and Procedures

You cannot EXPLAIN procedures.

For example, if you compile a procedure under the name update_orders and then perform the following EXPLAIN, an error occurs because update_orders is not a valid SQL statement.

     EXPLAIN update_orders;

You should use EXPLAIN while building your procedures. You must extract the SQL text from the procedure body and EXPLAIN the individual statements by themselves. This might require modification of the procedure SQL text in some cases. For example, you might have to remove an INTO clause or add a USING request modifier to represent procedure variables and parameters.

Effect of Request Cache Peeking on EXPLAIN Reports

If you specify USING data, or if you specify a DATE or CURRENT_DATE built-in function in a request, or both, the system can invoke Request Cache peeking. See . In this case, the EXPLAIN text indicates the peeked literal values.

If you do not specify USING data, or if the USING variables, DATE or CURRENT_DATE values, or both are not peeked, then there is no impact on either the generated plan or the generated EXPLAIN text.

Note that parameterized requests specified without a USING request modifier, but using either CLIv2 data parcel flavor 3 (Data) or CLIv2 parcel flavor 71 (DataInfo), cannot be explained using any of the following request modifiers or statements:
  • EXPLAIN
  • DUMP EXPLAIN
  • INSERT EXPLAIN

For details about data parcel formats, see Teradata Call-Level Interface Version 2 Reference for Mainframe-Attached Systems, B035-2417 or Teradata Call-Level Interface Version 2 Reference for Workstation-Attached Systems, B035-2418.

Using EXPLAIN to Determine the Database Objects that a View Accesses

Explaining a request does not necessarily report the names of all the underlying database objects accessed by that request, but it does provide the names of all the base tables accessed.

To determine the objects that a particular view accesses, including any nested views, use the SHOW statement. See “SHOW” in SQL Data Definition Language - Syntax and Examples, B035-1144.

For example, the following request reports the create text and containing databases for all underlying tables and views accessed by view_name.

     SHOW QUALIFIED SELECT *
     FROM view_name;

On the other hand, the following request reports only the underlying tables accessed by view_name, and not any nested views that are accessed.

     EXPLAIN SELECT *
     FROM view_name;

2PC Session Mode

If you specify an EXPLAIN modifier with a request in 2PC session mode, and that request is followed by a 2PC function, then the SQL portion of the request is explained, but the 2PC function is not.

For example, if the following multistatement request were submitted, the INSERT and SELECT would be explained; the VOTE would not be explained. The VOTE is, however, sent to the AMPs:

     EXPLAIN INSERT tab1(1,2)
            ;SELECT *
             FROM tab1
            ;VOTE

VOTE is specified by the system administrator, not by users submitting requests in 2PC mode.