15.10 - General Method of Maintaining a Join Index During INSERT Operations - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
15.10
category
Programming Reference
User Guide
featnum
B035-1142-151K

This topic describes a general method by which Teradata Database maintains a join index during insert operations.

The following is an example of maintaining join index during an INSERT request.

Note the following items in the EXPLAIN output:

 

This step …

Does this …

             9

Produces the new join result rows for the join index.

           11

Deletes any formerly unmatched outer join rows in the join index.

           12

Inserts the new join result rows into the join index.

     EXPLAIN 
     INSERT ordertbl
       SELECT * 
       FROM neworders;
   
      *** Help information returned. 46 rows. 
      *** Total elapsed time was 1 second.
   
Explanation
-----------------------------------------------------------------------
  1) First, we lock a distinct df2."pseudo table" for read on a
     RowHash to prevent global deadlock for df2.neworders.
  2) Next, we lock a distinct df2."pseudo table" for write on a
     RowHash to prevent global deadlock for df2.orderjoinline.
  3) We lock a distinct df2."pseudo table" for write on a RowHash to
     prevent global deadlock for df2.ordertbl.
  4) We lock a distinct df2."pseudo table" for read on a RowHash to
     prevent global deadlock for df2.lineitem.
  5) We lock df2.neworders for read, we lock df2.OrderJoinLine
     for write, we lock df2.ordertbl for write, and we lock
     df2.lineitem for read.
  6) We do an all-AMPs RETRIEVE step from df2.neworders by way of an
     all-rows scan with no residual conditions into Spool 1, which is
     built locally on the AMPs.  Then we do a SORT to order Spool 1 by
     row hash.  The input table will not be cached in memory, but it is
     eligible for synchronized scanning.  The result spool file will
     not be cached in memory.  The size of Spool 1 is estimated to be 2
     rows.  The estimated time for this step is 0.04 seconds.
  7) We do a MERGE into df2.ordertbl from Spool 1.
  8) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
     an all-rows scan into Spool 3, which is redistributed by hash code
     to all AMPs.  Then we do a SORT to order Spool 3 by row hash.  The
     result spool file will not be cached in memory.  The size of Spool
     3 is estimated to be 2 rows.  The estimated time for this step is
     0.07 seconds.
  9) We do an all-AMPs JOIN step from Spool 3 (Last Use) by way of a
     RowHash match scan, which is joined to df2.lineitem.  Spool 3
     and df2.lineitem are joined using a merge join, with a join
     condition of ("df2.lineitem.l_orderkey = Spool_3.o_orderkey").
     The input table df2.lineitem will not be cached in memory, but
     it is eligible for synchronized scanning.  The result goes into
     Spool 2, which is built locally on the AMPs.  Then we do a SORT to
     order Spool 2 by row hash.  The result spool file will not be
     cached in memory.  The size of Spool 2 is estimated to be 20 rows.
     The estimated time for this step is 0.37 seconds.
 10) We do an all-AMPs RETRIEVE step from Spool 2 by way of an all-rows
     scan into Spool 4, which is built locally on the AMPs.  Then we do
     a SORT to order Spool 4 by join index.
 11) We do a MERGE DELETE to df2.orderjoinline from Spool 2 (Last
     Use).
 12) We do a MERGE into df2.orderjoinline from Spool 4 (Last Use).
 13) We spoil the parser's dictionary cache for the table.
 14) We spoil the parser's dictionary cache for the table.
 15) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> No rows are returned to the user as the result of statement 1.