15.00 - Step Building Logic for Recursive Queries - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1184-015K
Language
English (United States)

Step Building Logic for Recursive Queries

The Teradata Database parses recursive queries to identify the seed and recursive parts. The system defines the seed component to be all SELECT statements in the body of the recursive query that do not make any recursive references. By default, all that remains is defined to be the recursive component.

Consider the following recursive view definition as an example.

    CREATE RECURSIVE VIEW tc (source, destin, carrier, depth) AS (
      SELECT f.source, f.destin, f.carrier, 0 AS depth
      FROM flights AS f
      WHERE f.source = 'Paris'
    UNION ALL
      SELECT tcq.source, f.destin, f.carrier, tc.depth+1 AS depth
      FROM tc tcq, flights AS f
      WHERE tcq.destin=f.source 
      AND   depth <= 100
    UNION ALL
      SELECT tcq.source, r.destin, 'EuroRail', tc.depth+1 AS depth
      FROM tc AS tcq, trains AS r
      WHERE tcq.destin=r.source 
      AND   depth <= 100
    UNION ALL
      SELECT r.source, r.destin, 'EuroRail', 0 AS depth
      FROM trains AS r
      WHERE r.source = 'Kaiserslautern');

The seed query component of the view definition is highlighted in boldface text. The system treats all the rest of the SQL text as the recursive component of the definition.

Consider the following view definition as the foundation for examining how the Optimizer builds the steps to process a query made against it.

    CREATE RECURSIVE VIEW reachable_from (destin, cost,depth) AS (
      SELECT root.destin,root.cost, 0 AS depth
      FROM flights AS root
      WHERE root.source = 'paris'            
    UNION ALL
      SELECT result.destin, seed.cost + result.cost, seed.depth + 1 
             AS newdepth
      FROM reachable_from AS seed, flights AS result    
      WHERE seed.destin = result.source 
      AND   newdepth <= 20);

Now execute an EXPLAIN request modifier on the following query to see the steps the Optimizer builds to process this view.

    EXPLAIN SELECT * 
    FROM reachable_from;
 
Explanation
---------------------------------------------------------------------------
  1) First, we lock a distinct DF2."pseudo table" for read on a RowHash
     to prevent global deadlock for DF2.Root.
  2) Next, we lock DF2.Flights for read.
  3) We do a single-AMP RETRIEVE step from DF2.Root by way of the
     primary index "Root.Source = 'Paris'" with no residual
     conditions into Spool 1 (all_amps), which is built locally on the AMPs.  
     The size of Spool 1 is estimated with low
     confidence to be 5 rows.  The estimated time for this step is 0.56
     seconds. 
  4) We do an all-AMPs RETRIEVE step from Spool 1 by
     way of an all-rows scan into Spool 2 (all_amps), which is
     built locally on the AMPs.  The size of  Spool 2 is estimated
     with low confidence to be 5 rows.  The estimated time for
     this step is 0.67 seconds.      
  5) We do an all-AMPs JOIN step from DF2.Result by way of a RowHash
     match scan, which is joined to Spool 1 (Last Use) by way of a
     RowHash match scan.  DF2.Result and Spool 1 are joined using a
     merge join, with a join condition of ("Destin = Result.Source")  AND 
     "((Depth+1)<= 20)"
     The result goes into Spool 3 (all_amps), which is built locally on
     the AMPs.  The size of Spool 3 is estimated with no confidence to
     be 25 rows.  The estimated time for this step is 0.61 seconds.
  6) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by
     way of an all-rows scan into Spool 1 (all_amps), which is
     which is built locally on the AMPs.  The size of Spool 1 is estimated
     with low confidence to be 25 rows.  The estimated time for
     this step is 0.67 seconds. 
     If one or more rows are inserted into spool 1, then go to step 4.
  7) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of
     an all-rows scan into Spool 4 (group_amps), which is built locally
     on the AMPs.  The size of Spool 4 is estimated with no confidence
     to be 1255 rows.  The estimated time for this step is 0.67 seconds.
  8) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 4 are sent back to the user as the result of
     statement 1.  The total estimated time is 2.50 seconds.

Note the following things about this hypothetical EXPLAIN report.

  • Spool 1 in the plan is used to compute the result of the steps built for the seed query.
  • Step 4 feeds the seed query steps result into Spool 2, which then contains the final result.
  • Step 5 performs the recursive steps of the query.
  • It uses Spool 1 as the seed result and joins that to flights.

  • The result of that join, Spool 3, is then treated as the next seed.
  • To do this, Spool 3 feeds into Spool 1 which is used again by the seed query step building process.

    Note that Spool 1 is cleared in step 5 by marking it as “last use”.

  • The Optimizer marks the recursive steps block starting at step 4 and ends at step 6.
  • That block is then executed repeatedly until the result of step 6 is empty.

  • The Dispatcher does the repetitive execution of the recursive steps block.
  • In Step 7, the retrieval of the rows from Spool 2 into Spool 4 produces the final result set that is returned to the requestor.