Applying Limit with Order - Aster Execution Engine

Teradata Aster® Developer Guide

Product
Aster Execution Engine
Release Number
7.00.02
Published
July 2017
Language
English (United States)
Last Update
2018-04-13
dita:mapPath
xnl1494366523182.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
ffu1489104705746
lifecycle
previous
Product Category
Software

With SQL-MapReduce Collaborative Planning, you can add logic to your SQL-MapReduce function that, in response to a Planner request, applies limit if the function produces data in the desired order.

For example, consider this query, which retrieves the first 20 rows of session information ordered on userid and ts:

SELECT userid, ts, sessionid
FROM Sessionize(ON clickstream
  PARTITION BY userid
  ORDER BY ts
  TIMECOLUMN ('ts')
  TIMEOUT (60))
ORDER BY userid, ts
LIMIT 20;

Without SQL-MapReduce Collaborative Planning, the plan consists of the following steps:

  1. Execute the following query and store the output data in the temporary table tmp1:
    Sessionize (SELECT * FROM clickstream ORDER BY userid, ts)
  2. Sort the data in tmp1.
    SELECT * FROM tmp1 ORDER BY userid, ts LIMIT 20;
  3. Send the first 20 rows of data to the Queen from each vWorker.
  4. Return the first 20 rows of data to the application from the Queen.

However, with SQL-MapReduce Collaborative Planning logic added to the Sessionize function, it can respond to the Planner request by applying limit and returning the first 20 rows on each vWorker after executing the query in step 1. This allows the Planner to eliminate the expensive sort and limit step (see step2 above), resulting in a significant optimization, as shown in the following figure.

The function must produce output data in the same order that is expected by LIMIT (in this case, 'userid, ts'), in order for the limit pushdown to the function to be correct.
Limit with order example

Note that since each vWorker returns 20 rows to the Queen, the final limit application to choose the top 20 rows is needed.

During the planning time, an instance of the function is created on a randomly chosen vWorker. The runtime contract negotiation as well as planning contract negotiation happens between the Queen and the function’s instance.