15.10 - EXPLAIN Request Modifier and Join Processing - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
15.10
category
Programming Reference
User Guide
featnum
B035-1142-151K

The following descriptions are generated by EXPLAIN when applied to sample join requests.

Each explanation is preceded by the request syntax and is followed by a listing of any new terminology found in the display.

This request has a WHERE condition based on the value of a unique primary index, which produces efficient processing even though a product join is used (compare with the merge join used to process the request in “Merge Join Example” on page 583).

     EXPLAIN SELECT Hours, EmpNo, Description
             FROM Charges, Project
             WHERE Charges.Proj_Id = 'ENG-0003'
             AND Project.Proj_Id = 'ENG-0003'    
             AND Charges.WkEnd > Project.DueDate ;
 

This request returns the following EXPLAIN report:

Explanation 
--------------------------------------------------------
1) First, we lock PERSONNEL.Charges for read.
2) Next, we do a single-AMP RETRIEVE step from PERSONNEL.Project by way of the unique primary index "PERSONNEL.Project.Proj_Id = ’ENG-003’" with no residual conditions into Spool 2, which is duplicated on all AMPs. The size of Spool 2 is estimated to be 4 rows. The estimated time for this step is 0.07 seconds.
3) We do an all AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to PERSONNEL.Charges with a condition of ("PERSONNEL.Charges.Proj_Id = ’ENG-0003’"). Spool 2 and PERSONNEL.Charges are joined using a product join, with a join condition of ("PERSONNEL.Charges.WkEnd > Spool_2.DueDate"). The result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 6 rows. The estimated time for this step is 0.13 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are back to the user as the result of statement 1. The total estimated time is 0.20 seconds.

New terminology in this explanation is defined as follows:

 

                          Phrase

                                            Definition

single-AMP RETRIEVE step from … by way of the unique primary index

A single row of a table is selected using a unique primary index that hashes to the single AMP on which the row is stored.

duplicated on all AMPs

The contents of a spool file, selected from the first table involved in the join, is replicated on all the AMPs that contain another table involved in the join.

all-AMPs JOIN step … by way of an all-rows scan

Table rows are searched row by row on each AMP on which they reside. Rows that satisfy the join condition are joined to the spooled row or rows.

condition of …

An intermediate condition used to qualify the joining of selected rows with spooled rows (as compared with an overall join condition).

product join

One of the types of join processing performed by Teradata Database.

This request returns the following EXPLAIN report:

     EXPLAIN SELECT Name, DeptName, Loc
             FROM Employee, Department
             WHERE Employee.DeptNo = Department.DeptNo ;
 
Explanation 
--------------------------------------------------------
1) First, we lock PERSONNEL.Department for read, and we lock PERSONNEL.Employee for read.
2) Next, we do an all-AMPs RETRIEVE step from PERSONNEL.Employee by way of an all-rows scan with no residual conditions into Spool 2, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash. The size of Spool 2 is estimated to be 8 rows. The estimated time for this step is 0.10 seconds.
3) We do an all-AMPs JOIN step from PERSONNEL.Department by way of a RowHash match scan with no residual conditions, which is joined to Spool 2 (Last Use).  PERSONNEL.Department and Spool 2 are joined using a merge join, with a join condition of ("Spool_2.DeptNo = PERSONNEL.Department.DeptNo"). The  result goes into Spool 1, which is built locally on the AMPs. The size of Spool 1 is estimated to be 8 rows. The estimated time for this step is 0.11 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.22 seconds.

New terminology in this explanation is defined as follows.

 

         Phrase

                                                        Definition

merge join

One of the join methods used by Teradata Database.

This request returns the following EXPLAIN report:

     EXPLAIN SELECT Employee.EmpNum,Department.DeptName, Employee.Salary
             FROM Employee, Department
             WHERE Employee.Location = Department.Location ;
   
     ***Help information returned. 30 rows.
     ***Total elapsed time was 1 second.
 
Explanation 
-------------------------------------------
 
1) First, we lock a distinct PERSONNEL."pseudo table" for read on a RowHash to prevent global deadlock for PERSONNEL.Department.
2) Next, we lock a distinct PERSONNEL."pseudo table" for read on a RowHash to prevent global deadlock for PERSONNEL.Employee.
3) We lock PERSONNEL.Department for read, and we lock PERSONNEL.Employee for read.
4) We do an all-AMPs RETRIEVE step from PERSONNEL.Employee by way of an all-rows scan with no residual conditions into Spool 2 fanned out into 22 hash join partitions, which is redistributed by hash code to all AMPs. The size of Spool 2 is estimated to be 3,995,664 rows. The estimated time for this step is 3 minutes and 54 seconds.
5) We do an all-AMPs RETRIEVE step from PERSONNEL.Department by way of an all-rows scan with no residual conditions into Spool 3 fanned out into 22 hash join partitions, which is redistributed by hash code to all AMPs. The size of Spool 3 is estimated to be 4,000,256 rows. The estimated time for this step is 3 minutes and 54 seconds.
6) We do an all-AMPs JOIN step from Spool 2 (Last Use) by way of an all-rows scan, which is joined to Spool 3 (Last Use). Spool 2 and Spool 3 are joined using a hash join of 22 partitions, with a join condition of ("Spool_2.Location = Spool_3.Location"). The result goes into Spool 1, which is built locally on the AMPs. The result spool field will not be cached in memory. The size of Spool 1 is estimated to be 1,997,895,930 rows. The estimated time for this step is 4 hours and 42 minutes.
7) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 4 hours and 49 minutes.
 
DBS Control Record - Performance Fields:
 
HTMemAlloc        = 5%
SkewAllowance     = 75%

New terminology in this explanation is defined as follows:

 

           Phrase

                                                      Definition

hash join

One of the types of join processing performed by Teradata Database.

This request returns the following EXPLAIN report:

     EXPLAIN SELECT dept_name, name, yrs_exp
             FROM employee, department
             WHERE employee.emp_no  = department.mgr_no
             AND   department.dept_no = 100;
 
Explanation 
-------------------------------------------------------
 
1) First, we do a single AMP JOIN step from PERSONNEL.Department by way of the unique primary index "PERSONNEL.Department.Dept_No = 100" with no residual condition which is joined to PERSONNEL.Employee by way of the unique primary index "PERSONNEL.Employee.Emp_No = PERSONNEL.Department.Mgr_No". PERSONNEL.Department and PERSONNEL.Employee are joined using a nested join. The result goes into Spool 1, which is built locally on that AMP. The size of Spool 1 is estimated to be 1 rows. The estimated time for this step is 0.10 seconds.
 
> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.10 seconds.

New terminology in this explanation is defined as follows:

 

                   Phrase

                                            Definition

single-AMP JOIN step from … by way of the unique primary index

A single row on one AMP is selected using the value of a unique index defined for its table.

Using the value in that row which hashes to a unique index value in a row of a second table on another AMP, the first row is joined with the second row. (Note that when a table is accessed by its unique primary index, the need for a rowhash lock on the table is implied, even though it is not explicitly stated in the Explain text.)

nested join

One of the types of join processing performed by the Teradata Database.

The request in this example selects columns only from the primary table.

If an additional column was selected from the table being joined via the subquery (for example, if the request was SELECT name, dept_no, loc FROM employee, department), the result would be a Cartesian product.

     EXPLAIN SELECT name, dept_no 
             FROM employee
             WHERE dept_no NOT IN (SELECT dept_no
                                   FROM department
                                   WHERE loc = 'CHI')
             ORDER BY name;

This request returns the following EXPLAIN report:

Explanation 
--------------------------------------------------------
1) First, we lock PERSONNEL.Department for read, and we lock PERSONNEL.Employee for read.
2) Next, we execute the following steps in parallel. 
   1) We do an all-AMPs RETRIEVE step from PERSONNEL.Department by way of an all-rows scan with a condition of ("PERSONNEL.Department.Loc = ’CHI’") into Spool 2, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 2 by row hash and the sort key in spool field1 eliminating duplicate rows. The size of Spool 2 is estimated to be 4 rows. The estimated time for this step is 0.07 seconds.
   2) We do an all-AMPs RETRIEVE step from PERSONNEL.Employee by way of an all-rows scan with no residual conditions into Spool 3, which is redistributed by hash code to all AMPs. Then we do a SORT to order Spool 3 by row hash. The size of Spool 3 is estimated to be 8 rows. The estimated time for this step is 0.10 seconds.
3) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of an all-rows scan, which is joined to Spool 2 (Last Use). Spool 3 and Spool 2 are joined using an exclusion merge join, with a join condition of ("Spool_3.Dept_No = Spool_2.Dept_No"). The result goes into Spool 1, which is built locally on the AMPs. Then we do a SORT to order Spool 1 by the sort key in spool field1. The size of Spool 1 is estimated to be 8 rows. The estimated time for this step is 0.13 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.23 seconds.

New terminology in this explanation is defined as follows:

 

               Phrase

                                                       Definition

SORT to order Spool 2 by row hash and the sort key in spool field1 eliminating duplicate rows...

Rows in the spool are sorted by hash code using a uniqueness sort to eliminate duplicate rows. Uniqueness is based on the data in spool field1.

The contents of spool field1 depend on the query and might comprise any of the following:

  • A concatenation of all the fields in the spool row (used for queries with SELECT DISTINCT or that involve a UNION, INTERSECT, or MINUS operation).
  • A concatenation of the rowIDs that identify the data rows from which the spool row was formed (used for complex queries involving subqueries).
  • Some other value that uniquely defines the spool row (used for complex queries involving aggregates and subqueries).
  • SORT to order Spool 1 by the sort key in spool field1

    This last sort is in response to the “ORDER BY” clause attached to the primary SELECT request.

    exclusion merge join

    One of the types of join processing performed by Teradata Database.