Example: Using the HASH BY Option With a NoPI Table - Advanced SQL Engine - Teradata Database

SQL Data Manipulation Language

Product
Advanced SQL Engine
Teradata Database
Release Number
17.00
Published
September 2020
Language
English (United States)
Last Update
2021-01-23
dita:mapPath
qtb1554762060450.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata Vantage™

Assume that you have created the following NoPI tables for this example:

     CREATE TABLE orders (
       o_orderkey    INTEGER NOT NULL,
       o_custkey     INTEGER,
       o_orderstatus CHAR(1) CASESPECIFIC,
       o_totalprice  DECIMAL(13,2) NOT NULL,
       o_ordertsz    TIMESTAMP(6) WITH TIME ZONE NOT NULL,
       o_comment     VARCHAR(79))
     UNIQUE INDEX (o_orderkey),
     PARTITION BY COLUMN;
     CREATE TABLE orders_staging AS orders
     WITH NO DATA
     NO PRIMARY INDEX;

The following INSERT … SELECT request redistributes rows by the hash value of o_orderkey to provide even distribution of the data selected from orders_staging and ordered locally to obtain better run length compression on o_ordertsz before locally copying into orders.

     INSERT INTO orders
       SELECT *
       FROM orders_staging
       HASH BY o_orderkey
       LOCAL ORDER BY o_ordertsz;

For the HASH BY clause, o_orderkey resolves to o_orderkey in the orders table. Because this corresponds to the first expression in the select expression list of the SELECT request, which is o_orderkey from orders_staging, the database distributes the spool generated for the SELECT request on o_orderkey. Similarly, the database uses orders.staging. o_ordertsz to order the spool before locally inserting into orders.

This example uses the same tables.

     INSERT INTO orders
       SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice + 10,
              o_ordertsz, o_comment
       FROM orders_staging
       HASH BY o_totalprice;

For the HASH BY clause, o_totalprice resolves to o_totalprice in the orders table. Because this corresponds to the fourth expression in the select expression list of the SELECT request, which is o_totalprice + 10,the database distributes the spool generated for the SELECT request on the value of this expression, not on the value of orders_staging.o_totalprice.

To distribute on the values of orders_staging.o_totalprice, qualify the reference in the HASH BY clause as the following revision of the previous INSERT … SELECT request does.

     INSERT INTO orders
       SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice + 10,
              o_ordertsz, o_comment
       FROM orders_staging
       HASH BY orders_staging.o_totalprice;