Example: HASH BY Option with NoPI Table - Analytics Database - Teradata Vantage

SQL Data Manipulation Language

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-12-13
dita:mapPath
pon1628111750298.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
esx1472246586715
lifecycle
latest
Product Category
Teradata Vantage™

Assume that you have created the following NoPI tables for this example:

     CREATE TABLE orders (
       o_orderkey    INTEGER NOT NULL,
       o_custkey     INTEGER,
       o_orderstatus CHAR(1) CASESPECIFIC,
       o_totalprice  DECIMAL(13,2) NOT NULL,
       o_ordertsz    TIMESTAMP(6) WITH TIME ZONE NOT NULL,
       o_comment     VARCHAR(79))
     UNIQUE INDEX (o_orderkey),
     PARTITION BY COLUMN;
     CREATE TABLE orders_staging AS orders
     WITH NO DATA
     NO PRIMARY INDEX;

The following INSERT ... SELECT request redistributes rows by the hash value of o_orderkey to provide even distribution of the data selected from orders_staging and ordered locally to obtain better run length compression on o_ordertsz before locally copying into orders.

     INSERT INTO orders
       SELECT *
       FROM orders_staging
       HASH BY o_orderkey
       LOCAL ORDER BY o_ordertsz;

For the HASH BY clause, o_orderkey resolves to o_orderkey in the orders table. Because this corresponds to the first expression in the select expression list of the SELECT request, which is o_orderkey from orders_staging, the database distributes the spool generated for the SELECT request on o_orderkey. Similarly, the database uses orders.staging.o_ordertsz to order the spool before locally inserting into orders.

This example uses the same tables.

     INSERT INTO orders
       SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice + 10,
              o_ordertsz, o_comment
       FROM orders_staging
       HASH BY o_totalprice;

For the HASH BY clause, o_totalprice resolves to o_totalprice in the orders table. Because this corresponds to the fourth expression in the select expression list of the SELECT request, which is o_totalprice + 10,the database distributes the spool generated for the SELECT request on the value of this expression, not on the value of orders_staging.o_totalprice.

To distribute on the values of orders_staging.o_totalprice, qualify the reference in the HASH BY clause as the following revision of the previous INSERT ... SELECT request does.

     INSERT INTO orders
       SELECT o_orderkey, o_custkey, o_orderstatus, o_totalprice + 10,
              o_ordertsz, o_comment
       FROM orders_staging
       HASH BY orders_staging.o_totalprice;