Multiple inputs are combined using what is essentially a cogroup operation, extended to support dimensional inputs. Grouping follows OUTER JOIN semantics, with NULLs considered equal to one another. The SQL-MapReduce function is invoked once for each unique partition in the partitioned inputs. All dimensional inputs are available at each invocation. The function can output one or more tuples at each invocation.
The unique partitions of the partitioned inputs are determined in one of two mutually exclusive ways:
- One or more partition_attributes_input inputs are combined into partitions using a cogroup operation.
The cogroup operation forms one partition for each unique combination of partitioning attributes present in any of the inputs. Each partition provides the values of the partitioning attributes and the tuples from each input that agree on those values. If a given input has no tuples for a particular combination of partitioning attributes, an empty set of tuples is provided for that input.
- A single partition_any_input is processed wherever its data is stored.
Each invocation receives its input tuples on the vworker where they currently reside in the database, so the data is not redistributed first.
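The cogroup case described above can be illustrated with a minimal Python sketch. It is not the database's implementation: the names `cogroup`, `run`, and `count_rows`, the sample inputs, and the use of `None` to stand in for NULL are all assumptions made for illustration. The sketch forms one partition per unique combination of partitioning attributes found in any input, supplies an empty list for an input with no matching tuples, and passes every dimensional input to each invocation.

```python
def cogroup(partitioned_inputs, key_columns):
    """Group rows from several inputs by their partitioning attributes.

    partitioned_inputs: dict mapping an input name to a list of row dicts.
    key_columns: names of the partitioning attributes.
    Returns a dict mapping each key tuple to {input name: [rows]}.
    """
    partitions = {}
    for name, rows in partitioned_inputs.items():
        for row in rows:
            # None stands in for NULL. Python treats None == None as true,
            # so rows with NULL keys fall into the same partition.
            key = tuple(row[c] for c in key_columns)
            if key not in partitions:
                # Every input starts with an empty list, so an input with
                # no rows for this key still appears, as an empty set.
                partitions[key] = {n: [] for n in partitioned_inputs}
            partitions[key][name].append(row)
    return partitions


def run(function, partitioned_inputs, key_columns, dimensional_inputs):
    """Invoke `function` once per partition, with all dimensional inputs."""
    output = []
    for key, groups in cogroup(partitioned_inputs, key_columns).items():
        output.extend(function(key, groups, dimensional_inputs))
    return output


def count_rows(key, groups, dims):
    """Hypothetical function: emit one tuple of row counts per partition."""
    yield {
        "key": key,
        "orders": len(groups["orders"]),
        "clicks": len(groups["clicks"]),
        "regions": len(dims["regions"]),
    }


# Hypothetical sample data: two partitioned inputs and one dimensional input.
orders = [{"user": "a", "amt": 1}, {"user": None, "amt": 2}]
clicks = [{"user": "b"}, {"user": None}, {"user": "a"}, {"user": "a"}]
regions = [{"region": "EU"}, {"region": "US"}]

results = run(
    count_rows,
    {"orders": orders, "clicks": clicks},
    ["user"],
    {"regions": regions},
)
for row in results:
    print(row)
```

In this sketch the function is invoked three times: once for user "a", once for the NULL key shared by both inputs, and once for user "b", where the `orders` input contributes an empty list. The second case, a single partition_any_input, involves no grouping at all and is therefore not modeled here.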