16.20 - Partitioned Tables and Join Indexes - Teradata Database - Teradata Vantage NewSQL Engine

Teradata Vantage™ SQL Data Definition Language Detailed Topics

Product
Teradata Database
Teradata Vantage NewSQL Engine
Release Number
16.20
Release Date
March 2019
Content Type
Programming Reference
Publication ID
B035-1184-162K
Language
English (United States)

You can create join indexes on a table that is row-partitioned, column-partitioned, or multilevel column-partitioned with one or more row partitioning levels and you can also row partition or column partition a join index. See Rules and Restrictions for Column-Partitioned Join Indexes.

The partitioning expressions for a join index can be based on either numeric or character data. See Partitioned and Nonpartitioned Primary Indexes.

Row partitioning for a join index functions as a constraint on the rows that can be updated in its underlying base table, for example with an SQL DELETE, INSERT, MERGE, or UPDATE operation. Define partitioning expressions for a join index that do not prevent rows from being inserted into, updated, or deleted from the base tables.

You can create a partitioning for a join index that specifies the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP functions in one or more of its partitioning expressions. See Partitioning Expressions Using DATE or TIMESTAMP Built-In Functions. Use the ALTER TABLE TO CURRENT statement to reconcile the date and timestamp values of these partitioning expressions periodically. See ALTER TABLE TO CURRENT.

Row partitioning is valuable for optimizing the performance of range queries on the partitioning column set, but can be neutral or even suboptimal for non-range queries or for range queries on columns that are not part of the partitioning expression for the join index, so you should test the performance of row-partitioned join indexes carefully before introducing them to your production environment.

Depending on the application, you can use the NO RANGE and UNKNOWN partitions (see Purpose and Behavior of the NO RANGE and UNKNOWN Partitions) to gracefully handle INSERT, UPDATE, and DELETE requests that cause one or more of the partitioning expressions in the join index to evaluate to null without otherwise aborting such problematic update requests. Update is used here in its generic sense, meaning any delete, insert, update, or merge operation.

You can specify a WHERE clause in the CREATE JOIN INDEX statement to create a sparse join index for which only those rows that meet the condition of the WHERE clause are inserted into the index, or, for the case of a row in the join index being updated in such a way that it no longer meets the conditions of the WHERE clause after the update, cause that row to be deleted from the index. See Sparse Join Indexes.

The process for this activity is as follows:

  1. Teradata Database checks the WHERE clause condition for its truth value after the update to the row.
    Condition Result Description
    FALSE Deletes the row from the sparse join index.
    TRUE Retains the row in the sparse join index and proceeds to stage b.
  2. Teradata Database evaluates the new result of the partitioning expression for the updated row.
    Partitioning Expression Result Description
    Null or to a value outside the range 1 through 65,035, inclusive, for 2-byte partitioning or outside the range 1 through 9,223,372,036,854,775,807, inclusive, for 8-byte partitioning. Base table or the sparse join index not updated, error returned.
    Value between 1 and 65,035, inclusive for 2-byte partitioning or to a value between 1 and 9,223,372,036,854,775,807, inclusive, for 8-byte partitioning. Stores the row in the appropriate partition, which might be different from the partition in which it was previously stored, and continues processing requests.

A sparse join index is also helpful because Teradata Database checks its WHERE clause first when index maintenance is needed and the partitioning expression constraint is only checked if the WHERE clause condition can be satisfied.

You must consider all of the following factors when you are deciding what join indexes to create to support your workloads:
  • Data model
    • Table definitions
    • Foreign Key-Primary Key constraints
    • Dimensional hierarchies
  • Workload
    • Query patterns
    • Query frequencies
    • Query priorities
  • Data maintenance
    • Frequency of maintenance
    • Constraints specified by data maintenance statements and commands

Consider the advantages and the disadvantages of partitioning and join indexes to ensure that you define appropriate indexes for your workloads. Use EXPLAIN request modifiers to verify whether your join index is actually used and to determine whether the Optimizer applies any partition-related optimizations to the query.

Expression evaluation errors, such as divide by zero, can occur during the evaluation of a partitioning expression. The system response to such an error varies depending on the session mode in effect at the time the error occurs.

Session Mode Roll Back Performed
ANSI Request that contains the aborted statement.
Teradata Transaction that contains the aborted request.

When you design your partitioning expressions, you should construct them in such a way that expression errors either cannot, or are very unlikely to, occur.

Depending on the circumstances, you can use either an ALTER TABLE statement to revalidate the partitioning of a join index or an ALTER TABLE TO CURRENT statement. See General Rules for the MODIFY PRIMARY Clause and Comparing ALTER TABLE TO CURRENT and ALTER TABLE … REVALIDATE.

The ALTER TABLE TO CURRENT statement enables you to refresh the content of a join index without first dropping it and then recreating it. The relative efficiency of ALTER TABLE TO CURRENT requests when compared to the drop and create ALTER TABLE alternative depends on how often you must submit ALTER TABLE requests to update your join indexes as well as the type of DATE, CURRENT_DATE, or CURRENT_TIMESTAMP conditions that are defined in the partitioning.

If you refresh the join index infrequently and the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP condition requires a large volume of old rows to be removed and a large volume of new rows to be inserted, it can be more efficient to drop and recreate the join index.

For obvious cases, such as when the partitioning column set of a join index is updated as a result of a new DATE, CURRENT_DATE, or CURRENT_TIMESTAMP value, ALTER TABLE TO CURRENT requests internally delete all rows from the partitioning expression and then rebuild it using the new DATE, CURRENT_DATE, or CURRENT_TIMESTAMP value.

For the less obvious cases that require a DELETE ALL operation followed by rebuilding the index, the method is more efficient, but ALTER TABLE TO CURRENT requests do not use it, so you should consider using the drop-and-recreate alternative.

When you drop and then rebuild a join index, the existing security privileges and any statistics that have been collected on the join index are also dropped when you drop the join index.

When you must submit an ALTER TABLE request on both a base table and its join indexes, you should consider submitting ALTER TABLE requests for those join indexes that only specify a lower bound DATE, CURRENT_DATE, or CURRENT_TIMESTAMP condition. A lower bound DATE, CURRENT_DATE, or CURRENT_TIMESTAMP condition is a condition that results in rows from the join index being deleted when the DATE, CURRENT_DATE, or CURRENT_TIMESTAMP is refreshed to the latest date or timestamp value, for example, when j > CURRENT_DATE.

Using this method, any join index maintenance required after submitting an ALTER TABLE request against the base table need not be done unnecessarily on those join index rows that would have been deleted by an ALTER TABLE request on the join index.

For additional information about partitioned primary indexes and the rules and recommendations for using them, see Partitioned and Nonpartitioned Primary Indexes and Rules and Usage Notes for Partitioned Tables, and Teradata Vantage™ - Database Design, B035-1094.

Teradata Database uses both fast path DELETE and fast path deferred partition DELETE operations for the following join index maintenance cases.
  • Fast path deferred partition DELETE operations for the following scenarios.
    • Tables that are defined with a nonpartitioned join index
    • Nonpartitioned tables defined with a partitioned join index.
    • PPI tables defined with a PPI join index.
  • Fast path DELETE operations to multiple DELETE ALL steps for implicit multistatement request transactions or multistatement transactions that contain multiple DELETE ALL requests. Teradata Database uses fast path DELETE operations in both ANSI and Teradata session modes to process the DELETE ALL table_name requests if they are the last requests in the transaction that reference table_name.
  • Fast path DELETE operations for the following cases when tables have a simple or aggregate join index and the index is one of the following:
    • Single-table join index.
    • Multitable join index defined with an inner join.
    • Multitable join index where the table being deleted is the outer table of an outer join.

    The preceding criteria apply to the following cases.

    • DELETE ALL operations on a table that has a simple or aggregate join index. Teradata Database uses fast path DELETE operations to process both the base table and its join index.
    • Conditional DELETE operations on a table that has a simple or aggregate join index. Teradata Database uses fast path DELETE operations to process only the join index if the DELETE condition covers the entire index.

The join index partition DELETE operations and DELETE ALL operations greatly facilitate the performance of join index maintenance.