Using a Unique Join Index in the Access Path for a Query | Teradata Vantage - Using a Unique Join Index in the Access Path for a Query - Advanced SQL Engine - Teradata Database

SQL Request and Transaction Processing

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-24
dita:mapPath
ykx1561500561173.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1142
lifecycle
previous
Product Category
Teradata Vantage™

When multiple unique join indexes are defined on a single table, the Optimizer can use any of them for an access path. Any unique join indexes defined on a single table compete with one other for access path selection based on their costs, and the index that is determined to have the lowest cost when used as an access path is selected for the query plan by the Access Path Planner.

You should always collect statistics on the primary index of a unique join index, whether it is user-defined or system-defined, so the Optimizer can use those statistics to most accurately estimate the cost of using a particular unique join index.

The Optimizer can use the cardinality of a unique join index to match the base table predicates specified in a query, and cardinality estimates made using that method are reported with high confidence (see EXPLAIN Confidence Levels and Using Join Index Statistics to Estimate Single-Table Expression Cardinalities).

Unique Join Index Subsets and Access Path Selection

The Optimizer can use a unique join index such as a USI as an access path for single-row access in 2-AMP plans if the query specifies an equality condition on the columns that are defined as the primary index for the unique join index. The value in the equality condition forms the primary index key for accessing the unique join index.

For a unique join index whose primary index is defined explicitly as unique, as it is the case for user-defined unique join indexes, the coverage testing and the presence of an equality condition on the primary index column set of the join index guarantees that the join index returns at most one row and thus is sufficient to qualify the join index to be used for single-row access.

For system-defined unique join index, coverage testing is based on the VALIDTIME and TRANSACTIONTIME conditions that the join index is defined with as well as the VALIDTIME and TRANSACTIONTIME conditions in the query. Teradata Database adds VALIDTIME and TRANSACTIONTIME conditions to a query that specifies one or more temporal qualifiers.

You can also specify additional VALIDTIME and TRANSACTIONTIME conditions in the query to further restrict the result set. All VALIDTIME and TRANSACTIONTIME conditions found in the query are used for coverage testing.

If one user-defined unique join index is a subset of another, and both are in contention for being selected for the access path for a request, the Optimizer selects the index that has fewer columns because it can hash on a smaller key.

For example, suppose you create the following 2 unique join indexes on t1:

CREATE JOIN INDEX t1_ji1 AS
SELECT a1, ROWID
FROM t1
UNIQUE PRIMARY INDEX (a1);
CREATE JOIN INDEX t1_ji2 AS
SELECT a1, b1, ROWID
FROM t1
UNIQUE PRIMARY INDEX (a1,b1);

You then submit the following SELECT request against t1.

SELECT c1
FROM t1
WHERE a1=10
AND   b1= 5;

Both t1_ji1 and t2_ji2 can be used as access paths for this request, but the Optimizer selects t1_ji1 because it has fewer columns.

However, for a system-defined join index, the Optimizer cannot use the fewest columns heuristic because for the same value on the primary index, there can be multiple rows in the join index, and the Optimizer must consider the overall costing, including the number of rows per value and the size of the table.

Unique Join Indexes as an Access Path in Two-AMP Query Plans

You can define both a non-compressed and a non-value-ordered single-table join index with a unique primary index.

The following are the valid forms of unique join index:
  • A user-defined unique join index, which is a non-compressed single-table join index that is defined with a WHERE clause, contains base table ROWID in its select list and is defined with a unique primary index. Furthermore, the unique join index covers the query.

    In this case, coverage means the WHERE clause of the unique join index qualifies a row set that is a super set of the row set qualified by the WHERE clause of the query against the same table.

  • A system-defined unique join index, which is a system-defined non-compressed single-table join index used to enforce a temporal UNIQUE constraint, and the system-defined join index covers the query.

    In this case, coverage also refers to the WHERE clause coverage with an additional requirement that the table is queried using a temporal qualifier. For information about temporal tables, see Teradata Vantage™ - ANSI Temporal Table Support , B035-1186 and Teradata Vantage™ - Temporal Table Support , B035-1182 .

A unique join index can be used like a USI as an access path for single-row access in 2-AMP plans if the query specifies an equality condition on the columns that are defined as the UPI for the unique join index. The value in the equality condition forms the primary index key for accessing the unique join index.

For a unique join index whose primary index is defined as unique explicitly, as it is the case for user-defined unique join index, the coverage testing and the presence of an equality condition on the join index primary index column set guarantees the join index to return at most one row, and thus is sufficient to qualify the join index to be used for single-row access.

For system-defined unique join indexes, coverage testing is based on the VALIDTIME and TRANSACTIONTIME conditions that define the join index as well as the VALIDTIME and TRANSACTIONTIME conditions in the query. See Teradata Vantage™ - Temporal Table Support , B035-1182 for the definitions and use of these terms.

Vantage adds VALIDTIME and TRANSACTIONTIME conditions to a query that specifies temporal qualifies (see Teradata Vantage™ - Temporal Table Support , B035-1182 for details). You can also specify additional VALIDTIME and TRANSACTIONTIME conditions in the query to further restrict the result set.

The Optimizer uses all of the VALIDTIME and TRANSACTIONTIME conditions you specify in a query for coverage testing.

For example, for a CURRENT UNIQUE constraint defined on the TRANSACTIONTIME dimension of the Transaction Time table part_types, which is defined as follows:

CREATE MULTISET TABLE part_types (
  part_id       INTEGER NOT NULL,
  part_name     VARCHAR(20),
  part_type     CHARACTER(2)
  part_duration PERIOD(TIMESTAMP(6) WITH TIME ZONE) NOT NULL AS
                TRANSACTIONTIME,
  CURRENT TRANSACTIONTIME UNIQUE (part_id))
  PRIMARY INDEX (part_name);

Vantage defines a system-defined join index for part_types as follows:

CREATE SYSTEM_DEFINED JOIN INDEX  part_types_TJI004 AS
  CURRENT TRANSACTIONTIME SELECT ROWID, part_id, part_duration
  FROM part_types
UNIQUE PRIMARY INDEX (part_id);

The Optimizer adds the following condition to a request that queries the same table using an AS OF CURRENT_DATE qualifier:

BEGIN (part_id)<=CURRENT_DATE AND
END   (part_id)> CURRENT_DATE

Because the system-defined join index has a TRANSACTIONTIME condition that qualifies only Open rows to be stored in the join index, the index cannot cover the AS OF query by default. However, if the AS OF query explicitly specifies a TRANSACTIONTIME condition to qualify Open rows, then the join index can cover the query.

If a system-defined join index is used to enforce a unique constraint in the Transaction Time dimension only, an equality condition on the primary index column set of a system-defined join index is enough to guarantee that the join index returns at most one row.

However, if a system-defined join index is used to enforce a unique constraint in the Valid Time dimension, the index can contain multiple rows with the same value in column_list as long as they do not overlap the validtime_column. This means an equality condition on column_list with a unique constraint in the Valid Time dimension returns at most one row only if the table is queried for a single point in the Valid Time dimension. That is, the table is queried with a CURRENT or an AS OF temporal qualifier.

The conditions the Optimizer adds to a query for the CURRENT or AS OF qualifiers must also be evaluated as residual conditions on the system-defined join index to qualify the row that corresponds to that point in time. Furthermore, the system-defined join index must not have a partitioned primary index to avoid having to search multiple row partitions for the qualified row.

Therefore, in order for a system-defined join index that enforces a unique constraint in the Valid Time dimension to be usable for single-row access, the following things must be true:
  • The system-defined join index must not be defined with a PPI.
  • The system-defined join index must pass coverage testing.
  • The query must specify either a CURRENT or an AS OF temporal qualifier on the table.
  • The query must specify an equality condition on the columns that are defined as the primary index of the join index.

The following table summarizes when a system-defined join index qualifies for access path usage given that the coverage testing passes and the query specifies an equality condition on the primary index column set of the join index:

DML Qualifier Temporal Constraint Time Constraint Type Does a System-Defined Unique Join Index Qualify for Use in the Query Access Path?
CURRENT VALIDTIME CURRENT UNIQUE Yes
SEQUENCED UNIQUE Yes
TRANSACTIONTIME CURRENT UNIQUE Yes
AS OF date_expression | timestamp_expression VALIDTIME CURRENT UNIQUE Yes if date_expression | timestamp_expression >= resolved value of the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP for the single-table unique join index.
SEQUENCED UNIQUE Yes
TRANSACTIONTIME CURRENT UNIQUE Yes if the TRANSACTIONTIME for the query is UNTIL_CLOSED
SEQUENCED [period_of_applicability] VALIDTIME CURRENT UNIQUE No
SEQUENCED UNIQUE No
TRANSACTIONTIME CURRENT UNIQUE Yes if the TRANSACTIONTIME for the query is UNTIL_CLOSED
Because the use of a unique join index as an access path is limited to queries with an equality condition on the primary index columns, and a 2-AMP step requires a read of the base table for the qualified RowId, it is possible to find a more optimal plan using the same join index as a covering join index. The Query Rewrite subsystem can also use a unique join index as a regular join index for query rewrites. This means that a unique join index can play a dual role in query optimization.
  • It can be used as a regular join index and as either a partially covering or fully covering join index to rewrite the query.
  • If the query is not rewritten to use the unique join index, the unique join index can be treated as a unique index by the Access Path Planner.

    The Optimizer can use a unique join index as an access path if there is an equality condition on the base table columns that correspond to the primary index for the join index.

Consider the following example:

CREATE TABLE t1 (
  a1 INTEGER,
  b1 INTEGER,
  c1 INTEGER);
CREATE JOIN INDEX uji AS
  SELECT b1, c1, ROWID
  FROM t1
  WHERE c1 BETWEEN 200 AND 1000
UNIQUE PRIMARY INDEX (b1);

Suppose you submit the following SELECT request against t1:

SELECT b1,c1
FROM t1
WHERE b1=10
AND   c1=500;
The Optimizer evaluates the following plans for this SELECT request:
  • A plan that does a single-AMP retrieval from uji.

    This plan results from the join index rewrite that replaces t1 in the request with uji.

  • A plan that does a 2-AMP retrieval using uji

    This plan results from using uji as an access path.

As is always true, the Optimizer selects the plan with the lower cost.