HASH BY Clause
Purpose
Hashes rows across the AMPs in preparation to be input to a table function or table operator.
Syntax
where:
Syntax element … |
Specifies … |
column_name |
the column name set by which rows are to be partitioned across the AMPs before being input to a table function or table operator. |
ANSI Compliance
HASH BY is a Teradata extension to the ANSI SQL:2011 standard.
Rules and Restrictions
The rules and restrictions are:
Example : Hash Ordering Input Parameters to a Table Function
This example shows the use of the HASH BY clause for a table UDF. The table function add2int takes two integer values as input and returns both of them and their sum.
The query in this example selects all columns from add2int, which specifies that its input, dt, is hashed by dt.y1. The specified hashing might not be relevant to the add2int function and is only used for illustration.
The expected processing of the query is that dt is first spooled and then hashed by y1 among the AMPs. The final hashed spool is used as the input to add2int.
CREATE TABLE t1 (
a1 INTEGER,
b1 INTEGER);
CREATE TABLE t2 (
a2 INTEGER,
b2 INTEGER);
CREATE FUNCTION add2int (
a INTEGER,
b INTEGER)
RETURNS TABLE (
addend1 INTEGER,
addend2 INTEGER,
mysum INTEGER)
SPECIFIC add2int
LANGUAGE C
NO SQL
PARAMETER STYLE SQL
NOT DETERMINISTIC
CALLED ON NULL INPUT
EXTERNAL NAME 'CS!add3int!add2int.c';
WITH dt(x1,y1) AS (SELECT a1,b1
FROM t1)
SELECT *
FROM TABLE (add2int(dt.x1,dt.y1)
HASH BY y1) AS aa;
Example 2: HASH BY and LOCAL ORDER BY Clauses in the Same Statement
This example shows the use of the HASH BY and LOCAL ORDER BY clauses for sorting the input to a table UDF.
Some applications need to enforce the ordering of input to table UDFs. Rather than sorting the input arguments in the application or table UDF, you can specify the HASH BY and LOCAL ORDER BY clauses when you invoke the table UDF in the FROM clause of a SELECT request. The scope of input includes derived tables, views, and WITH objects.
Consider this table definition:
CREATE TABLE tempdata (
tid INTEGER,
tts TIMESTAMP,
x INTEGER,
y INTEGER);
Suppose the data in tempdata looks like this:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Now consider a table UDF named char_from_rows that produces a text string that represents all of the timestamp, x, and y values in rows that have the same value for tid. Furthermore, the values in the text string are ordered by timestamp. Here is the definition of the table UDF:
CREATE FUNCTION char_from_rows(tid INTEGER,
tts TIMESTAMP,
x INTEGER,
y INTEGER)
RETURNS TABLE(outID INTEGER,
outCHAR VARCHAR(64000))
LANGUAGE C
NO SQL
EXTERNAL NAME 'CS!charfromrows!udfsrc/charfromrows.c'
PARAMETER STYLE SQL;
The following statement invokes the char_from_rows table UDF, using a nonrecursive WITH clause to hash the input by tid and value order the input on each AMP by tts:
WITH wq (tID1, tTS1, x1, y1) AS
(SELECT tID, tTS, x, y
FROM tempData)
SELECT *
FROM TABLE (char_from_rows(wq.tID1, wq.tTS1, wq.x1, wq.y1)
HASH BY tID1 LOCAL ORDER BY tTS1) AS tudf;
The output looks like this:
outID outCHAR
----- ---------------------------------------------------------------------
1001 2008-02-03 14:33:1510112008-02-03 14:44:2020242008-02-03 14:59:083130
1002 2008-02-04 11:02:1910112008-02-04 11:33:0422182008-02-04 11:48:272929
For More Information
For more information about the use of the HASH BY clause, see: