The following EXPLAIN shows how the Optimizer may use the join index in a query when aggregation is performed on the join indexed rows.
EXPLAIN SELECT l_partkey, AVG(l_quantity), AVG(l_extendedprice) FROM lineitem , orders WHERE l_orderkey = o_orderkey AND o_orderdate > '2003-11-01' GROUP BY l_partkey;
Explanation
--------------------------------------------------------------------------
1) First, we lock LOUISB.order_join_line for read on a
reserved Row Hash to prevent a global deadlock.
2) Next, we do a SUM step to aggregate from join index table
LOUISB.order_join_line with a range constraint of (
"LOUISB.order_join_line.Field_1026 > 971101") with a residual
condition of ("(LOUISB.order_join_line.Field_1026 > 971101) AND (NOT
(LOUISB.order_join_line.o_orderdate IS NULL ))"), and the grouping
identifier in field 1. Aggregate Intermediate Results are
computed globally, then placed in Spool 3. The input table is not cached
in memory, but it is eligible for synchronized scanning.
3) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
an all-rows scan into Spool 1, which is built locally on the AMPs.
The size of Spool 1 is estimated to be 10 rows. The estimated time
for this step is 0.32 seconds.
4) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.