15.00 - Join and Hash Index Maintenance Considerations - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Join and Hash Index Maintenance Considerations

Each time a base table row is updated, its corresponding join or hash index data is modified within the same unit of work. Because MultiLoad and FastLoad are incompatible with join and hash indexes, maintenance of a base table that has a join or hash index must be performed using SQL, or the join or hash index must be dropped before the MultiLoad or FastLoad utility is run and the index then rebuilt afterward. As illustrated by the following figure, you can use SQL to take either a set processing approach or a row-at-a-time approach to join and hash index maintenance.

If an INSERT … SELECT request is used to load data into the base table, then the unit of work is the entire SQL request. In such a case the join or hash index maintenance is performed in batch mode. Inserts to the base table are spooled and applied to the join or hash index structure in a single step of the query plan. In particular, when an INSERT … SELECT request specifies the primary index value of the target row set in its select list or in its WHERE clause, a single‑AMP merge step is used to process the INSERT operation.

When row-at-a-time updates are performed, the join or hash index structure is updated concurrently, once per base table row.

Join and hash index maintenance is optimized to use localized rowhash‑level locking whenever possible. Table‑level locks are applied when the maintenance is performed using all‑AMPs operations such as spool merges.

When the table being updated contains a join or hash index, an EXPLAIN of the UPDATE request illustrates whether an all-AMPs operation and table‑level locking are used at the time of the UPDATE operation, or a single-AMP operation with rowhash locking.

The following is a list of the conditions that support single-AMP updating and rowhash‑level locking on join or hash indexes when the base table is updated a row‑at-a-time. Be aware that these optimizations might not be applied to complicated indexes when cardinality estimates are made with low confidence or when the index is defined on three or more tables with all its join conditions based on non-unique and non-indexed columns.

When a row-at-a-time INSERT on the base table is being performed, the following restrictions apply:

  • The join index can have a different primary index from the base table.
  • Inserts that specify an equality constraint on the primary index column set of a table with joins between some non-primary index columns of the table and the primary index columns of another table are optimized to use rowhash‑level locks. For example,
  • Given the following join index definition:

         CREATE JOIN INDEX j2 AS 
           SELECT x1,x2,x3,y2,y3,z2 
           FROM t1,t2,t3 
           WHERE x1=y1 
           AND   y2=z1 
         PRIMARY INDEX (y3);

    When a row is inserted into t1 with the following INSERT statement,

         INSERT INTO t1 
         VALUES (1,1,1);

    corresponding j2 rows are materialized by a query like the following:

         SELECT 1,1,1,y2,y3,z2 
         FROM t2,t3 
         WHERE y1=1 
         AND   y2=z1;

    As long as the number of rows that qualify t2.y1=1 (see step 2-1) is within the 10% of the number of AMPs threshold and the number of rows resulting from the t2 and t3 join step 5 is also within this threshold, this INSERT statement does not incur any table‑level locks.

    For those INSERT operations that specify an equality constraint on a non-primary index or that involve joins with non-primary index columns, temporary hash indexes can be created by the system to process them in such a way that single‑AMP retrieves are used and non-primary index joins are converted to primary index joins. See the following examples:

    Given the following join index definition:

         CREATE JOIN INDEX j1 AS 
           SELECT x1,x2,x3,y2,y3 
           FROM t1,t2 
           WHERE x2= y1 
         PRIMARY INDEX (x1);

    When a row is inserted into t2 with the following INSERT statement

         INSERT INTO t2 
         VALUES (1,1,1);

    corresponding j1 rows are materialized by a query like the following:

         SELECT x1, x2, x3 ,1 ,1 
         FROM t1 
         WHERE x2=1;

    Given the following join index definition:

         CREATE JOIN INDEX j1 AS
           SELECT x1,x2,x3,y2,y3 
           FROM t1,t2 
           WHERE x2=y1 
         PRIMARY INDEX (x1);

    When a row is deleted from t2, delete t2 where t2.y1=1, corresponding j1 rows are materialized by a query like the following:

         SELECT x1,x2,x3,y2,y3 
         FROM t1,t2 
         WHERE x2=y1 
         AND y1=1;

    Given the following join index definition:

         CREATE JOIN INDEX j2 AS 
           SELECT x1,x2,x3,y1,y3 
           FROM t1, t2 
           WHERE x2=y2;

    When a row is deleted from t2, delete t2 where t2.y1=1, corresponding j2 rows are materialized by a query which involves a non-PI‑to‑non-PI join:

         SELECT x1,x2,x3,y1,y3 
         FROM t1, t2 
         WHERE x2=y2 
         AND   y1=1;

    If a single-table non-covering join index is defined on t1.x2, calling it stji_t1_x2, the (t1.x2=t2.y2) join that is processed by duplicating the qualified t2 row in step 4 to join with t1 in step 5 can be processed as follows:

    1 a single-AMP retrieve from t2 by way of PI t2.y1=1 into Spool 2. Spool 2 is hash redistributed by t2.y2 and qualifies as a group-AMPs spool.

    2 a few-AMPs join from Spool 2 to stji_t2_x2 on y2=x2, results going into Spool 3. Spool 3 is redistributed by t1.rowid and also qualifies as a group‑AMPs spool.

    3 a few-AMPs join back from Spool 3 to t1 on ROWID.

    As long as the number of rows that qualify the join t1.x2=t2.y2 WHERE t1.y1=1 is within the 10% threshold, no table‑level locks are incurred for the DELETE statement.

    Given the following join index definition:

         CREATE JOIN INDEX j3 AS 
           SELECT x1,x2,x3,y2,y3,z2,z3
           FROM t1, t2, t3 
           WHERE x2=y2 
           AND   y1=z1 
         PRIMARY INDEX (y3);

    When a row is inserted into t1, INSERT INTO t1 VALUES (1,1,1), corresponding j3 rows are materialized by a query of the kind:

         SELECT 1,1,1,y2,y3,z2,z3 
         FROM t2,t3 
         WHERE y2=1 
         AND   y1=z1

    The index must be a single-table join index or hash index with the following exceptions:

  • The join index primary index is composed of a column set from the target table of the simple insert.
  • The join index primary index is composed of a column set that is joined, either directly or indirectly through transitive closure, to the target table of the simple insert.
  • In these cases, multitable join indexes are also optimized for single‑AMP merge steps with rowhash‑level locking.

    For example, given the following tables, the join index definitions that follow qualify for single‑AMP merge optimization:

         CREATE TABLE t1 (
           x1 INTEGER, 
           x2 INTEGER, 
           x3 INTEGER) 
         PRIMARY INDEX (x1);
     
         CREATE TABLE t2 (
           y1 INTEGER, 
           y2 INTEGER, 
           y3 INTEGER) 
         UNIQUE PRIMARY INDEX (y1);
     
         CREATE TABLE t3 (
           z1 INTEGER, 
           z2 INTEGER, 
           z3 INTEGER) 
         PRIMARY INDEX (z1);
     
         CREATE JOIN INDEX j1 AS 
          SELECT x1,x2,x3,y2,y3 
          FROM t1,t2 
          WHERE x2=y1 
         PRIMARY INDEX (x1);
     
         CREATE JOIN INDEX j2 AS 
          SELECT x1,x2,x3,y2,y3,z2,z3 
          FROM t1,t2,t3 
          WHERE x2=y1 
          AND   y1=z1 
         PRIMARY INDEX (x2);

    The following table indicates several base table inserts, their corresponding single‑AMP merge join index inserts, and the qualifying condition that permits the single‑AMP optimization:

     

    Base Table Insert

    Join Index Insert

    Qualifying Condition

    INSERT INTO t1 
    VALUES (1,1,1);
    INSERT INTO j1
    SELECT 1,1,1,y1,y3
    FROM t2, t3
    WHERE y1=1;
     
    INSERT INTO j2
    SELECT 1,1,1,y1,y3,z1,z3
    FROM t2,t3
    WHERE y1=1
    AND y1=z1;

    The primary index of the join index is composed of a column set from the target table of the simple insert.

    INSERT INTO t2
    VALUES (1,1,1);
    INSERT INTO j2
    SELECT x1,x2,x3,1,1,z2,z3
    FROM t1,t3
    WHERE x2=1
    AND 1=z1
    AND x2=z1;

    The primary index of the join index is composed of a column set that is joined directly or indirectly through transitive closure to the target table of the simple insert.

    INSERT INTO t3
    VALUES (1,1,1);
    INSERT INTO j2
    SELECT x1,x2,x3,y2,y3,1,1
    FROM t1,t2
    WHERE x2=y1
    AND y1=1;
  • When a row-at-a-time UPDATE on the base table is being performed, the following restrictions apply:
  • The value for the primary index of the join index must be specified in the WHERE clause predicate of the request.
  • The UPDATE cannot change the primary index of the join index.
  • When it is cost effective to access the affected join or hash index rows by means of a NUSI, it is done using rowhash locks and a direct update step. If only a few rows are updated (a few‑AMPs operation), rowhash READ locks are placed on the NUSI subtable for the index rows that are read. Rowhash locks are also applied to the base table using the rowID values extracted from the index rows.
  • When a row-at-a-time DELETE on the base table is performed, the following restrictions apply:
  • The value for the primary index of the join index must be specified in the WHERE clause predicate of the DELETE statement.
  • The deleted row must not be from the inner table of an outer join in the CREATE JOIN INDEX statement with the following exceptions:
  • The outer join condition in the join index is specified on a UPI column from the inner table.
  • The outer join condition in the join index is specified on a NUPI column from the inner table.
  • When it is cost effective to access the affected join or hash index rows by means of a NUSI, it is done using rowhash‑level locks and a direct delete step. If only a few rows are deleted (a few‑AMPs operation), rowhash‑level READ locks are placed on the NUSI subtable for the index rows that are read. Rowhash‑level locks are also applied to the base table using the rowID values extracted from the index rows.
  • Under all other conditions, a single-row update causes a table‑level WRITE lock to be placed on the hash or join index.

    If table‑level locks are reported in the EXPLAIN text, then consider using set processing approaches with one or more secondary indexes as an alternative.