Cogroups
Cogroups are used with protected and non-protected C/C++ and Java table operators.
Cogroups are used for table operators that contain multiple ON clauses. By default, the input from the ON clauses is pooled together into one group as long as the input has the same value as the partition key. Input that does not qualify is ignored to ensure that the results make sense.
For information about the use of multiple ON clauses, see SQL Data Manipulation Language.
The cogroup feature is on by default, but if necessary, you can turn off this functionality by calling FNC_TblOpDisableCoGroup() in the contract function.
For information about FNC calls, see SQL External Routine Programming.
Example : Cogroup Used for a Table Operator with Multiple Inputs
This example demonstrates the grouping operation used with multiple inputs to a table operator, called a cogroup. All of the inputs that satisfy the condition are grouped together in a single group. Cogroup matches the partition keys of multiple inputs, thus ensuring that the results are correct. Using a cogroup enhances the ability of table operators to work with multiple input streams and dimension tables.
Suppose that you have the following two tables:
WebLog |
||
cookie |
cart_amount |
page |
AAAA
|
$60
|
Thankyou
|
AAAA
|
$140
|
Thankyou
|
BBBB
|
$100
|
Thankyou
|
CCCC
|
|
Intro
|
CCCC
|
$200
|
Thankyou
|
DDDD
|
$100
|
Thankyou
|
AdLog |
||
cookie |
ad_name |
action |
AAAA
|
Champs
|
Impression
|
AAAA
|
Puppies
|
Click
|
BBBB
|
Apples
|
Click
|
CCCC
|
Baseball
|
Impression
|
CCCC
|
Apples
|
Click
|
You run the following query:
SELECT cookie, cart_amt, adname, action
FROM attribute_sales (
ON (SELECT cookie,cart_amt FROM weblog
WHERE page = 'thankyou' ) as W PARTITION BY cookie
ON adlog as S PARTITION BY cookie) as result1 ;
The inputs are grouped together similar to the following table:
Grouped and Nested Relations With the Cogroup on Cookie |
||
Cookie |
WebLog |
AdLog |
AAAA |
AAAA,$60, thankyou AAAA,$140,thankyou |
AAAA,champs,impression AAAA,puppies,click |
BBBB |
BBBB,$100,thankyou |
BBBB,apples,click |
CCCC |
CCCC,$200,thankyou |
CCCC,baseball,impression CCCC,apples,click |
DDDD |
DDDD,$100,thankyou |
|
The output of the query is similar to the following:
adname |
attr_revenue |
champs |
$40 |
puppies |
$160 |
apples |
$240 |
baseball |
$40 |
Each time you invoke the table operator, only the inputs that have the same partition key values participate in the invocation.
If there are no rows of the PARTITION BY ANY and PARTITION BY key inputs on an AMP and the query involves DIMENSION input, the table operator is not invoked even though the DIMENSION table has rows on that AMP.