16.10 - EXPLAIN Request Modifier: Examples of Complex Queries - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
16.10
created_date
June 2017
category
Programming Reference
User Guide
featnum
B035-1142-161K

This section examines the usefulness of the EXPLAIN request modifier in different situations.

You should always use EXPLAIN reports to analyze any new queries under development. Subtle differences in the way a query is structured can produce enormous differences in its resource impact and performance while at the same time returning the identical result set.

EXPLAIN request modifier terms are defined in EXPLAIN Request Modifier.

In these examples, the personnel.employee table has a unique primary index defined on the emp_no column and a nonunique secondary index defined on the name column.

SELECT Example

The EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT name, dept_no
             FROM employee
             WHERE empno = 10009;
Explanation
-----------------------------------------------------
1) First, we do a single-AMP RETRIEVE step from Personnel.Employee by way of the unique primary index "PERSONNEL.Employee.EmpNo = 10009" with no residual conditions. The estimated time for this step is 0.04 seconds.
-> The row is sent directly back to the user as the result of statement 1. The total estimated time is 0.04 seconds.

SELECT Example With a Condition on a Nonunique Index

The WHERE condition in this example is based on a column that is defined as a nonunique index. Note that the system places a READ lock on the table.

The EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT emp_no, dept_no
             FROM employee
             WHERE name = 'Smith T';
   
Explanation
-----------------------------------------------------------------------
1) First, we lock a distinct PERSONNEL."pseudo table" for read on a
   RowHash to prevent global deadlock for PERSONNEL.employee.
2) Next, we lock PERSONNEL.employee for read.
3) We do an all-AMPs RETRIEVE step from PERSONNEL.employee by way of
   index # 4 "PERSONNEL.employee.Name = 'Smith T '" with no residual
   conditions into Spool 1 (group_amps), which is built locally on
   the AMPs. The size of Spool 1 is estimated with low confidence to
   be 33,563 rows (1,443,209 bytes). The estimated time for this
   step is 0.34 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
   in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
   statement 1. The total estimated time is 0.34 seconds.

SELECT Example With a Condition on a Unique Secondary Index

Assume that the employee table has another column (socsecno), where soc_sec_no is defined as a unique secondary index.

If the WHERE condition is based on this column, then the EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT name, emp_no
             FROM employee
             WHERE soc_sec_no = '123456789';
Explanation
-----------------------------------------------------
1) First, we do a two-AMP RETRIEVE step from PERSONNEL.Employee by way
   of unique index # 20 "PERSONNEL.Employee.socSecNo = 123456789" with 
   no residual conditions. The estimated time for this step 
   is 0.09 seconds.
-> The row is sent directly back to the user as the result of statement 1. The total    
   estimated time is 0.09 seconds.

SELECT Example With a Condition Based on a Join

In this example, the WHERE clause defines an equality constraint that governs a join.

The rows of the department table are copied to a spool for use in the join operation.

The EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT dept_name, name
             FROM employee, department
             WHERE employee.emp_no = department.mgr_no;
Explanation
--------------------------------------------------------------------
1) First, we lock a distinct PERSONNEL."pseudo table" for read on a 
   RowHash to prevent global deadlock for PERSONNEL.department.
2) Next, we lock a distinct PERSONNEL."pseudo table" for read on a 
   RowHash to prevent global deadlock for PERSONNEL.employee.
3) We lock PERSONNEL.department for read, and we lock 
   PERSONNEL.employee for read.
4) We do an all-AMPs RETRIEVE step from PERSONNEL.department by way of 
   an all-rows scan with no residual conditions into Spool 2, which is
   redistributed by hash code to all AMPs. Then we do a SORT to order 
   Spool 2 by row hash. The size of Spool 2 is estimated to be 8 rows.
   The estimated time for this step is 0.11 seconds.
5) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of a
   RowHash match scan, which is joined to PERSONNEL.employee. 
   Spool 2 and PERSONNEL.employee are joined using a merge join, 
   with a join condition of 
  ("PERSONNEL.employee.EmpNo = Spool_2.MgrNo").
   The result goes into Spool 1, which is built locally on the AMPs.
   The size of Spool 1 is estimated to be 8 rows. The estimated time
   for this step is 0.07 seconds.
-> The contents of Spool 1 are sent back to the user as the result of
   statement 1. The total estimated time is 0 hours and 0.19 seconds.

SELECT Example With a Condition Based on a Subquery

In this example, the constraint that governs the join is defined by a subquery predicate and the ORDER BY clause specifies a sorted result.

The EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT name, emp_no
             FROM employee
             WHERE emp_no IN (SELECT emp_no
                              FROM charges)
             ORDER BY name;
Explanation
--------------------------------------------------------------------
1) First, we lock a distinct PERSONNEL."pseudo table" for read on a
   RowHash to prevent global deadlock for PERSONNEL.charges.
2) Next, we lock a distinct PERSONNEL."pseudo table" for read on a
   RowHash to prevent global deadlock for PERSONNEL.employee.
3) We lock PERSONNEL.charges for read, and we lock PERSONNEL.employee
   for read.
4) We do an all-AMPs RETRIEVE step from PERSONNEL.charges by way of an
   all-rows scan with no residual conditions into Spool 2, which is
   redistributed by hash code to all AMPs. Then we do a SORT to order
   Spool 2 by row hash and the sort key in spool field1 eliminating
   duplicate rows. The size of Spool 2 is estimated to be 12 rows.
   The estimated time for this step is 0.15 seconds.
5) We do an all-AMPs JOIN step from PERSONNEL.employee by way of an
   all-rows scan with no residual conditions, which is joined to
   Spool 2 (Last Use). PERSONNEL.employee and Spool 2 are joined using
   an inclusion merge join, with a join condition of
   ("PERSONNEL.employee.EmpNo = Spool_2.EmpNo"). The result goes into
   Spool 1, which is built locally on the AMPs. Then we do a SORT to
   order Spool 1 by the sort key in spool field1. The size of Spool 1
   is estimated to be 12 rows. The estimated time for this step is 
   0.07 seconds.
-> The contents of Spool 1 are sent back to the user as the result of 
   statement 1. The total estimated time is 0 hours and 0.23 seconds.

Recursive SELECT Example

The following example demonstrates a recursive query:

     EXPLAIN WITH RECURSIVE temp_table (employee_number, depth) AS
     ( SELECT root.employee_number, 0 as depth
       FROM Employee root
       WHERE root.manager_employee_number = 801
     UNION ALL
       SELECT indirect.employee_number, seed.depth+1 as depth
       FROM temp_table seed, Employee indirect
       WHERE seed.employee_number = indirect.manager_employee_number
       AND depth <= 20
     )
     SELECT employee_number, depth FROM temp_table;

EXPLAIN generates the following report for this request:

Explanation
-----------------------------------------------------------------------
  1) First, we lock a distinct PERSONNEL."pseudo table" for
     read on a RowHash to prevent global deadlock for PERSONNEL.root.
  2) Next, we lock PERSONNEL.root for read.
  3) We do an all-AMPs RETRIEVE step from PERSONNEL.root by
     way of an all-rows scan with a condition of (
     "PERSONNEL.root.manager_employee_number = 801") into
     Spool 3 (all_amps), which is built locally on the AMPs. The size
     of Spool 3 is estimated with no confidence to be 1 row. The
     estimated time for this step is 0.06 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 by way of an all-rows
     scan into Spool 2 (all_amps), which is built locally on the AMPs.
     The size of Spool 2 is estimated with no confidence to be 1 row.
     The estimated time for this step is 0.07 seconds.
  5) We execute the following steps in parallel.
       1) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by
          way of an all-rows scan with a condition of ("(DEPTH <= 20)
          AND (NOT (EMPLOYEE_NUMBER IS NULL ))") into Spool 4
          (all_amps), which is duplicated on all AMPs. The size of
          Spool 4 is estimated with no confidence to be 8 rows. The
          estimated time for this step is 0.03 seconds.
       2) We do an all-AMPs RETRIEVE step from
          PERSONNEL.indirect by way of an all-rows scan with a
          condition of
          ("NOT (PERSONNEL.indirect.manager_employee_number IS NULL)")
          into Spool 5 (all_amps), which is built locally on the AMPs.
          The size of Spool 5 is estimated with no confidence to be 8
          rows. The estimated time for this step is 0.01 seconds.
  6) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
     all-rows scan, which is joined to Spool 5 (Last Use) by way of an
     all-rows scan. Spool 4 and Spool 5 are joined using a single
     partition hash join, with a join condition of ("EMPLOYEE_NUMBER =
     manager_employee_number"). The result goes into Spool 6
     (all_amps), which is built locally on the AMPs. The size of Spool
     6 is estimated with no confidence to be 3 rows. The estimated
     time for this step is 0.08 seconds.
  7) We do an all-AMPs RETRIEVE step from Spool 6 (Last Use) by way of
     an all-rows scan into Spool 3 (all_amps), which is built locally
     on the AMPs. The size of Spool 3 is estimated with no confidence
     to be 4 rows. The estimated time for this step is 0.07 seconds.
     If one or more rows are inserted into spool 3, then go to step 4.
  8) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of
     an all-rows scan into Spool 7 (all_amps), which is built locally
     on the AMPs. The size of Spool 7 is estimated with no confidence
     to be 142 rows. The estimated time for this step is 0.07 seconds.
  9) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 7 are sent back to the user as the result of
     statement 1. The total estimated time is 0.38 seconds.

Step 3 indicates the processing of the seed statement inside the recursive query and produces the initial temporary result set.

Steps 4 through 7 correspond to the processing of the recursive statement inside the recursive query and repeat until no new rows are inserted into the temporary result set. Although steps 4 through 7 indicate a static plan, each iteration can produce spools with varying cardinalities; thus, the level of confidence for the spool size in these steps is set to no confidence.

Step 8 indicates the processing of the final result that is sent back to the user.

Large Table SELECT Example With a Complex Condition

Assume that a table named main is very large and that its columns named numa, numb, kind, and event are each defined as nonunique secondary indexes.

The request in this example uses these indexes to apply a complex conditional expression.

The EXPLAIN request modifier generates the following response for this request:

     EXPLAIN SELECT COUNT(*)
             FROM main
             WHERE numa = '101'
             AND   numb = '02'
             AND   kind = 'B'
             AND   event = '001';

The response indicates that bit mapping would be used.

Explanation
---------------------------------------------------------
1) First, we lock TESTING.Main for read.
2) Next, we do a BMSMS (bit map set manipulation) step
   that intersects the following row id bit maps:
   1) The bit map built for TESTING.Main by way of index # 12
      "TESTING.Main.Kind = ’B’".
   2) The bit map build for TESTING.Main by way of index # 8
      "TESTING.Main.NumB = ’02’".
   3) The bit map built for TESTING.Main by way of index # 16
      "TESTING.Main.Event = ’001’".
   The resulting bit map is placed in Spool 3. The estimated time
   for this step is 17.77 seconds.
3) We do a SUM step to aggregate from TESTING.Main by way of index # 4
   "TESTING.Main.NumA = ’101’" and the bit map in Spool 3 (Last Use)
   with a residual condition of ("(TESTING.Main.NumB = ’02’)
   and ((TESTING.Main.Kind = ’B’) and (TESTING.Main.Event = ’001’))").
   Aggregate Intermediate Results are computed globally, then
   placed in Spool 2.
4) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of an
   all-rows scan into Spool 1, which is built locally on the AMPs. The
   size of Spool 1 is estimated to be 20 rows. The estimated time for
   this step is 0.11 seconds.
5) Finally, we send out an END TRANSACTION step to all AMPs involved in
   processing the request.
-> The contents of Spool 1 are sent back to the user as a result of statement 1.

Implicit Multistatement INSERT Example

In the following BTEQ multistatement request, which is treated as an implicit transaction, the statements are processed concurrently.

In Teradata session mode, the EXPLAIN request modifier generates the following response for this request:

     EXPLAIN INSERT Charges (30001, 'AP2-0004', 890825, 45.0)
           ; INSERT Charges (30002, 'AP2-0004', 890721, 12.0)
           ; INSERT Charges (30003, 'AP2-0004', 890811, 2.5)
           ; INSERT Charges (30004, 'AP2-0004', 890831, 37.5
           ; INSERT Charges (30005, 'AP2-0004', 890825, 11.0)
           ; INSERT Charges (30006, 'AP2-0004', 890721, 24.5)
           ; INSERT Charges (30007, 'AP2-0004', 890811, 40.5)
           ; INSERT Charges (30008, 'AP2-0004', 890831, 32.0
           ; INSERT Charges (30009, 'AP2-0004', 890825, 41.5)
           ; INSERT Charges (30010, 'AP2-0004', 890721, 22.0) ;
Explanation
---------------------------------------------------------
1) First, we execute the following steps in parallel.
    1)    We do an INSERT into PERSONNEL.charges.
    2)    We do an INSERT into PERSONNEL.charges.
    3)    We do an INSERT into PERSONNEL.charges.
    4)    We do an INSERT into PERSONNEL.charges.
    5)    We do an INSERT into PERSONNEL.charges.
    6)    We do an INSERT into PERSONNEL.charges.
    7)    We do an INSERT into PERSONNEL.charges.
    8)    We do an INSERT into PERSONNEL.charges.
    9)    We do an INSERT into PERSONNEL.charges.
    10)   We do an INSERT into PERSONNEL.charges.
2) Finally, we send out an END TRANSACTION step to all
   AMPs involved in processing the request.
-> No rows are returned to the user as the result of
     statement 1.
     No rows are returned to the user as the result of
     statement 2.
     No rows are returned to the user as the result of
     statement 3.
     No rows are returned to the user as the result of
     statement 4.
     No rows are returned to the user as the result of
     statement 5.
     No rows are returned to the user as the result of
     statement 6.
     No rows are returned to the user as the result of
     statement 7.
     No rows are returned to the user as the result of
     statement 8.
     No rows are returned to the user as the result of
     statement 9.
     No rows are returned to the user as the result of
     statement 10.

ANSI Versus Teradata Session Mode Update Example

This example shows the EXPLAIN differences between running the session in ANSI versus Teradata session modes:

   EXPLAIN UPDATE Employee
           SET deptno = 650
           WHERE deptno = 640;

In ANSI mode, EXPLAIN generates the following response for this request:

Explanation
--------------------------------------------------------------------
1) First, we lock a distinct PERSONNEL."pseudo table" for write
   on a RowHash to prevent global deadlock for PERSONNEL.employee.
2) Next, we lock PERSONNEL.employee for write.
3) We do an all-AMPs UPDATE from PERSONNEL.employee by way of an
   all-rows scan with a condition of
   ("PERSONNEL.employee.DeptNo = 640").
-> No rows are returned to the user as the result of statement 1.

In Teradata session mode, EXPLAIN generates this response for the same request:

Explanation
--------------------------------------------------------------------
1) First, we lock a distinct PERSONNEL."pseudo table" for write on a
   RowHash to prevent global deadlock for PERSONNEL.employee.
2) Next, we lock PERSONNEL.employee for write.
3) We do an all-AMPs UPDATE from PERSONNEL.employee by way of an
   all-rows scan with a condition of
   ("PERSONNEL.employee.DeptNo = 640").
4) Finally, we send out an END TRANSACTION step to all AMPs involved
   in processing the request.
-> No rows are returned to the user as the result of statement 1.

In ANSI session mode the transaction is not committed, therefore it is not ended, whereas in Teradata session mode, no COMMIT is required to end the transaction.

Row Trigger With Looping Trigger Action Example

In this example three tables t1, t2 and t3, and an AFTER row trigger with t1 as the subject table, are created. The trigger action modifies tables t2 and t3.

The EXPLAIN text for the INSERT operation, which is part of the trigger action, specifically marks the beginning and ending of the row trigger loop. The relevant phrases in the EXPLAIN report are highlighted in boldface type:

The DDL statements for creating the tables and the trigger are as follows:

     CREATE TABLE t1(
       i INTEGER,
       j INTEGER);

     CREATE TABLE t2(
       i INTEGER,
       j INTEGER);

     CREATE TABLE t3(
       i INTEGER,
       j INTEGER);

     CREATE TRIGGER g1 AFTER INSERT ON t1
     FOR EACH ROW
     (
     UPDATE t2 SET j = j+1;
     DELETE t2;
     DELETE t3;
     );

The EXPLAIN text reports the steps used to process the following INSERT … SELECT statement:

     EXPLAIN INSERT t1 SELECT * FROM t3;
      *** Help information returned. 50 rows.
      *** Total elapsed time was 1 second.
Explanation
----------------------------------------------------------------
1)  First, we lock a distinct EXP_TST1."pseudo table" for write on a
    RowHash to prevent global deadlock for EXP_TST1.t3.
2)  Next, we lock a distinct EXP_TST1."pseudo table" for write on a
    RowHash to prevent global deadlock for EXP_TST1.t2.
3)  We lock a distinct EXP_TST1."pseudo table" for write on a RowHash
    to prevent global deadlock for EXP_TST1.t1.
4)  We lock EXP_TST1.t3 for write, we lock EXP_TST1.t2 for write,
    and we lock EXP_TST1.t1 for write.
5)  We do an all-AMPs RETRIEVE step from EXP_TST1.t1 by way of an
    all-rows scan with no residual conditions into Spool 3 (all_amps),
    which is redistributed by hash code to all AMPs. Then we do a SORT
    to order Spool 3 by the sort key in spool field1 eliminating
    duplicate rows. The size of Spool 3 is estimated with low
    confidence to be 2 rows. The estimated time for this step
    is 0.03 seconds.
6)  We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
    an all-rows scan into Spool 2 (all_amps), which is duplicated on
    all AMPs. The size of Spool 2 is estimated with no confidence to be
    4 rows.
7)  We do an all-AMPs JOIN step from EXP_TST1.t3 by way of an all-rows
    scan with no residual conditions, which is joined to Spool 2
    (Last Use). EXP_TST1.t3 and Spool 2 are joined using an exclusion
    product join, with a join condition of
    ("((j =    EXP_TST1.t3.j) OR (j IS NULL)) AND
    (((j = EXP_TST1.t3.j) OR (EXP_TST1.t3.j IS NULL)) AND
    (((i = EXP_TST1.t3.i) OR (EXP_TST1.t3.i IS NULL)) AND
    ((i =    EXP_TST1.t3.i) OR (i IS NULL))))") where unknown
    comparison will be ignored. The result goes into Spool 1
    (all_amps), which is built locally on the AMPs. The size
    of Spool 1 is estimated with index join confidence to be
    2 rows. The estimated time for this step is
    0.03 seconds.
8)  We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
    an all-rows scan into Spool 4 (all_amps), which is redistributed by
    hash code to all AMPs. Then we do a SORT to order Spool 4 by row
    hash. The size of Spool 4 is estimated with index join confidence
    to be 2 rows. The estimated time for this step is 0.04 seconds.
9)  We do an all-AMPs MERGE into EXP_TST1.t1 from Spool 4 (Last Use).
10)  <BEGIN ROW TRIGGER LOOP> 
    we do an all-AMPs UPDATE from EXP_TST1.t2 by way of an all-rows
    scan with no residual conditions.
11) We do an all-AMPs DELETE from EXP_TST1.t2 by way of an all-rows
    scan with no residual conditions.
12) We do an all-AMPs DELETE from EXP_TST1.t3 by way of an all-rows
    scan with no residual conditions.
    <END ROW TRIGGER LOOP> for step 10. 
13) We spoil the parser's dictionary cache for the table.
14) We spoil the parser's dictionary cache for the table.
15) Finally, we send out an END TRANSACTION step to all AMPs involved
    in processing the request.
 -> No rows are returned to the user as the result of statement 1.

Few-AMPs Row Redistribution Example

Assume that you create a multiset table and a second table, as follows:

Create Multiset table
       customer_order (customer_name char(10),
                       order_no integer)
                       Primary Index (customer_name) ;
Create table order_detail (order_no integer,
                           product_name  char(20))
                           Primary Index (order_no) ;

The EXPLAIN request modifier generates the few AMPs text in step 1, showing that the step uses few AMPs for row redistribution. Query step 1 uses few AMPs because of the following factors:

  • the step is initiated in a single AMP that has rows with ‘Smith’ (NUPI), AND
  • the Smith rows are redistributed by the hash code of the order_no column, AND
  • the number of the redistributed rows is small and the rows are hashed to only few AMPs.
    EXPLAIN		SELECT customer_name, product_name
    		FROM customer_order CO, order_detail OD
    		WHERE CO.customer_name = 'Smith' AND
    		CO.order_no = OD.order_no  ;
    *** Help information returned. 20 rows.
    *** Total elapsed time was 1 second.
    Explanation
    --------------------------------------------------------------------
      1) First, we do a single-AMP RETRIEVE step from JW.CO by way of
         the primary index "JW.CO.customer_name = 'Smith '" with a
         residual condition of ("NOT (JW.CO.order_no IS NULL)") into 
         Spool 2 (group_amps), which is redistributed by the hash code
         of (JW.CO.order_no) to  few AMPs. Then we do a SORT to
         order Spool 2 by row hash. The size of Spool 2 is estimated
         with low confidence to be 2 rows (54 bytes). The estimated
         time for this step is 0.01 seconds.
      2) Next, we do a group-AMPs JOIN step from JW.OD by way of a
         RowHash match scan, which is joined to Spool 2 (Last Use) by
         way of a RowHash match scan. JW.OD and Spool 2 are joined using
         a merge join, with a join condition of
         ("order_no = JW.OD.order_no"). The result goes into Spool 1
         (group_amps), which is built locally on that AMP. The size of
         Spool 1 is estimated with index join confidence to be 3 rows
         (153 bytes). The estimated time for this step is 0.11 seconds.
      3) Finally, we send out an END TRANSACTION step to all AMPs
         involved in processing the request.
      -> The contents of Spool 1 are sent back to the user as the
         result of statement 1. The total estimated time is 0.11 seconds.