Usage Notes - Teradata Database

SQL Data Manipulation Language

Product
Teradata Database
Release Number
16.10
Published
June 2017
Language
English (United States)
Last Update
2018-04-25
dita:mapPath
psg1480972718197.ditamap
dita:ditavalPath
changebar_rev_16_10_exclude_audience_ie.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata® Database

Definition of Upsert

A simple upsert is a standalone update operation for which a subsequent standalone insert operation is coded that is performed only if the specific row to be updated is found not to exist. These actions require the performance of two individual SQL requests in the case where no row is found to update. This was the only means of performing a conditional insert on an update failure prior to the introduction of the upsert form of the UPDATE statement.

The more high-performing atomic upsert operation, represented by the upsert form of the UPDATE statement, is a single SQL request that includes both UPDATE and INSERT functionality. The specified update operation performs first, and if it fails to find a row to update, then the specified insert operation performs automatically.

Purpose of the Atomic Upsert Operation

The single-pass upsert is described as atomic to emphasize that its component UPDATE and INSERT SQL statements are grouped together and performed as a single, or atomic, SQL request.

Existing Teradata Parallel Data Pump scripts that use the upsert feature in its Teradata Parallel Data Pump-specific form, using paired UPDATE and INSERT requests, do not have to be changed to take advantage of the atomic upsert capability because Teradata Parallel Data Pump automatically converts an UPDATE … INSERT request pair in its older syntax into a corresponding atomic upsert request whenever appropriate.

The atomic upsert syntax can also be used by any CLIv2-based applications and is particularly useful when coded directly into BTEQ scripts.

Rules for Using Atomic Upsert

The following table describes the fundamental constraints on an atomic upsert operation.

Constraint Explanation
The UPDATE and INSERT components of the upsert operation must specify the same table. The purpose of an atomic upsert is to first attempt to update a particular row in a table and then, if that row is not found, to insert it into the table.

By specifying different tables for the UPDATE and INSERT components of the statement, the intent of the operation is violated.

The table must have a primary index, which can be row partitioned but not column partitioned. A primary index is needed to search for the row.
The UPDATE and INSERT components of the upsert operation must specify the same row.

In other words, the primary index value for the inserted row is identical to the primary index value for the targeted update row.

If an upsert operation cannot find the specific row it targets, then it inserts the row it would have otherwise updated, inserting the specific “update” values into the fields that would have been updated had the row existed in the table.

This constraint is met when the primary index value specified by the WHERE clause of the UPDATE component matches the primary index value implied by the column values specified in the INSERT component.

Because the value of the number generated for an INSERT into an identity column is not knowable in advance, you cannot perform an upsert operation into a target table that has an identity column as its primary index.

Upserts into identity column tables for which the identity column is not the primary index are valid.

The UPDATE component fully specifies the primary index value to ensure that all accesses are one-AMP operations. When the UPDATE component of the statement fully specifies the primary index, the system accesses any targeted row with a single-AMP hashed operation.

This rule is applied narrowly in the Teradata Parallel Data Pump case, where it is taken to mean that the primary index is specified in the WHERE clause of the UPDATE component as equality constraints using simple constraint values (either constants or USING clause references to imported data fields). Simple values are also assumed for constraints on non-index columns and for the update values in the SET clause, avoiding any costly subqueries or FROM clause references. Similarly, TPump has a performance-driven preference to avoid subqueries in the INSERT.

When you perform an upsert UPDATE to a row-partitioned table, the following rules must be followed or an error is returned:
  • The values of the partitioning columns must be specified in the WHERE clause of the UPDATE clause of the statement.
  • The INSERT clause of the statement must specify the same partition as the UPDATE clause.
  • The UPDATE clause must not modify a partitioning column.

The outcome of partitioning expression evaluation errors, such as divide-by-zero errors, depends on the session mode.

In this session mode … Expression evaluation errors roll back this work unit …
ANSI request.
Teradata transaction.

Locks and Concurrency

An UPDATE (Upsert form) operation sets a WRITE lock for the row being updated. For a nonconcurrent load isolated update on load isolated table, the update operation sets an EXCLUSIVE lock.

UPDATE (Upsert Form) Insert Operations and Row-Partitioned Tables

The rules for the INSERT component of the Upsert form of an UPDATE statement for a target row-partitioned table are as follows:

  • The target table can be row partitioned but cannot be column partitioned.
  • The outcome of partitioning expression evaluation errors, such as divide-by-zero errors, depends on the session mode.
    In this session mode … Expression evaluation errors roll back this work unit …
    ANSI request that contains the aborted request.
    Teradata transaction that contains the aborted request.
  • To insert a row, the partitioning expression must produce a partition number that results in a value between 1 and 65,535 (after casting to INTEGER, if it does not have an INTEGER data type).
  • The INSERT clause must specify the same partitioning column values as the UPDATE clause. This also applies when the UPDATE condition specifies conditions based on Period bound functions.
  • A Period column in an upsert request must be a term in an equality condition.
  • If you specify a Period column as part of the partitioning expression, you can only specify equality conditions on that column for an upsert request on that table. If you submit an upsert request that specifies inequality conditions on a Period column that is specified in a partitioning expression for the table, then Teradata Database aborts the request and returns an error.

    However, you can specify a Period column that is not defined as part of a partitioning expression for both equality and inequality conditions on that column for an upsert request.

  • If a partitioning expression specifies a Period bound function, then an equality condition on the bound function is treated as a partitioning bound matching condition.

    If both BEGIN and END bounds are specified in a partitioning expression of the table, then only an equality condition on both bounds is treated as a partitioning bound matching condition. This must result in a single partition.

  • The conditions IS UNTIL_CHANGED and IS UNTIL_CLOSED are treated as equality conditions only for the END bound function.
  • Collation mode has the following implications for upsert operations that insert rows into tables defined with a character partitioning. If the collation for a row-partitioned table is either MULTINATIONAL or CHARSET_COLL and the definition for the collation has changed since the table was created, Teradata Database aborts any request that attempts to insert or update a row in the table and returns an error to the requestor.

UPDATE (Upsert Form) Update Operations and RPPI Tables

The rules for the UPDATE component of the Upsert form of an UPDATE statement in a target row-partitioned table are as follows.

The target table must have a primary index and can be row partitioned but cannot be column partitioned.

You cannot submit an UPDATE request to update the partitioning columns of a table with row partitioning such that a partitioning expression does not result in a value between 1 and the number of partitions defined for that level.

You must specify the same partition of the combined partitioning expression for the INSERT component in the upsert form of an UPDATE request that you specify for the UPDATE component.

You must specify the values of the partitioning columns in the WHERE clause of the upsert form of an UPDATE request.

You also cannot modify the values of those partitioning columns; otherwise, the system aborts the request and returns an error message to the requestor.

You cannot update the system-derived columns PARTITION and PARTITION#L1 through PARTITION#L62.

Expression evaluation errors, such as divide by zero, can occur during the evaluation of a partitioning expression. The system response to such an error varies depending on the session mode in effect at the time the error occurs.

In this session mode … Expression evaluation errors roll back this work unit …
ANSI request that contains the aborted request.
Teradata transaction that contains the aborted request.

Take care in designing your partitioning expressions to avoid expression errors.

The INSERT clause must specify the same partitioning column values as the UPDATE clause. This also applies when the UPDATE condition specifies conditions based on Period bound functions.

If you specify a Period column as part of the partitioning expression, you can only specify equality conditions on that column for an upsert request on that table. An upsert request cannot specify inequality conditions on a Period column that is specified in a partitioning expression for the table.

However, an upsert request can specify a Period column that is not defined as part of a partitioning expression for equality and inequality conditions on that column.

If you specify a Period column in an upsert request, it must be a term in an equality condition.

If the partitioning expression for a target table specifies a Period bound function, then an equality condition on the bound function is treated as a partitioning bound matching condition.

If BEGIN and END bounds are specified in the partitioning expression of the table, only an equality on both bounds is processed as a partitioning bound matching condition. This must result in a single partition.

The conditions IS UNTIL_CHANGED and IS UNTIL_CLOSED are treated as equality conditions only for the END bound function.

UPDATE (Upsert Form) and Join Indexes

For a row inserted into a base table that causes an insert into a row-partitioned join index or an update of an index row in a row-partitioned join index:
  • the partitioning expressions for that join index row cannot evaluate to null, and
  • the partitioning expression must be an expression that is CASE_N or RANGE_N with a result between 1 and 65535 for the row.

Inserting a row into a base table does not always cause inserts or updates to a join index on that base table. For example, you can specify a WHERE clause in the CREATE JOIN INDEX statement to create a sparse join index for which only those rows that meet the condition of the WHERE clause are inserted into the index, or, for the case of a row in the join index being updated in such a way that it no longer meets the conditions of the WHERE clause after the update, cause that row to be deleted from the index.

The process for this is:
  1. The system checks the WHERE clause condition for its truth value after the row insert.
  2. The system evaluates the condition, then does one of the following:
  3. If the condition evaluates to FALSE, the system deletes the row from the sparse join index.
  4. If the condition evaluates to FALSE, the system retains the row in the sparse join index.

You cannot assign either a value or a null to the system-derived columns PARTITION or PARTITION#L1 through PARTITION#L62 in an insert operation.

Collation mode has the following implications for upsert operations that insert rows into tables defined with a character partitioning. If a noncompressed join index with a character partitioning under either an MULTINATIONAL or CHARSET_COLL collation sequence is defined on a table and the definition for the collation has changed since the join index was created, Teradata Database aborts any request that attempts to insert or update a row in the table and returns an error to the requestor whether the insert or update would have resulted in rows being modified in the join index or not.

If the partitioning expression for a table or noncompressed join index involves Unicode character expressions or literals, and the system has been backed down to a release that has a different Unicode character set than the one in effect when the table or join index was defined, you cannot insert or update rows in the table with an upsert operation.

Insert or update a row in a base table that causes an insert into a join index with row partitioning such that a partitioning expression for that index row does not result in a value between 1 and the number of partitions defined for that level.

Insert or update a row in a base table that causes an update of an index row in a join index with row partitioning such that a partitioning expression for that index row after the update does not result in a value between 1 and the number of partitions defined for that level.

Updating a base table row does not always cause inserts or updates to a join index on that base table. For example, you can specify a WHERE clause in the CREATE JOIN INDEX statement to create a sparse join index for which only those rows that meet the condition of the WHERE clause are inserted into the index, or, for the case of a row in the join index being updated in such a way that it no longer meets the conditions of the WHERE clause after the update, cause that row to be deleted from the index.

The process for this activity is as follows:

  1. The system checks the WHERE clause condition for its truth value after the update to the row.
    IF the condition evaluates to … THEN …
    FALSE the system deletes the row from the sparse join index.
    TRUE the system retains the row in the sparse join index and proceeds to stage b.
  2. The system evaluates the new result of the partitioning expression for the updated row.
    IF the partitioning expression … THEN …
    • evaluates to null, or
    • is an expression that is not CASE_N or RANGE_N
    it's result is not between 1 and 65535 for the row.

    The system aborts the request. It does not update the base table or the sparse join index, and returns an error.

    • evaluates to a value, and
    • is an expression that is not CASE_N or RANGE_N
    it's result is between 1 and 65535 for the row.

    The system stores the row in the appropriate partition, which might be different from the partition in which it was stored, and continues processing requests.

UPDATE (Upsert Form) and Subqueries

The Upsert form of UPDATE does not support subqueries.

Queue Tables and UPDATE (Upsert Form)

The best practice is to avoid using the UPDATE statement on a queue table because the operation requires a full table scan to rebuild the internal queue table cache. You should reserve this statement for exception handling.

An UPDATE statement cannot be in a multistatement request that contains a SELECT and CONSUME request for the same queue table.

For more information on queue tables and the queue table cache, see SQL Data Definition Language .

UPDATE (Upsert Form) As a Triggering Action

Triggers invoke the following behavior when fired as the result of an UPDATE (Upsert Form) request:

Trigger Defined on this Action Description
UPDATE Fired in all cases for qualified rows.

No insert triggers are performed.

INSERT Fired only if no row is updated and the INSERT is performed.

If no rows qualify, then no triggers are fired.

Triggers are fired as if the UPDATE and INSERT components in the atomic upsert request are performed as separate statements: unconditionally for UPDATE and conditionally for INSERT. The INSERT is performed only when no rows qualify for the UPDATE.

UPDATE (Upsert Form) As a Triggered Action

UPDATE (Upsert Form) requests are supported as triggered actions. The rules for their use as triggered actions are the same as for any other SQL statement.

Rules and Restrictions

The basic upsert constraints described in Rules for Using Atomic Upsert follow from the conventional definition of upsert functionality.

Unsupported Syntax or Features

This table lists unsupported syntax or features. Attempt to use them causes an error.

Unsupported Syntax Explanation
INSERT … SELECT None.
Positioned UPDATE A WHERE clause cannot use an updatable cursor to do a positioned UPDATE.
UPDATE … FROM None.
UPDATE … WHERE subquery A WHERE clause cannot use a subquery either to specify the primary index or to constrain a non-index column.
UPDATE … SET … primary_index_column You cannot update a primary index column because any change to the rowhash value for a row affects its distribution.
UPDATE … SET … partitioning_column You cannot update a partitioning column because any change to the partitioning for a row affects its location.
UPDATE … SET identity_column … You cannot update GENERATED ALWAYS identity column values.
UPDATE SET PARTITION You cannot update system-derived PARTITION values.

If PARTITION is a user-named column in a table, and not a GENERATED ALWAYS identity column, then it can be updated.

Using UPDATE (Upsert Form) Requests with Scalar Subqueries

The rules for using scalar subqueries with UPDATE (Upsert Form) requests are the same as those for simple UPDATE requests, except for the following additional rule:

  • Because the UPDATE component of an upsert operation must be a simple update, you cannot specify a correlated scalar subquery in its SET clause or its WHERE clause.

See Rules for Using Scalar Subqueries in UPDATE Requests.

Using UPDATE (Upsert Form) Requests on RPPI Tables

The following rules apply to using UPDATE (Upsert Form) requests on row-partitioned tables with a primary index.

  • When an update to a base table row that causes an insert into a join index with row partitioning, a partitioning expression for that index row must result in a value between 1 and the number of row partitions defined for that level. Otherwise, Teradata Database aborts the request and returns an error.
  • When an update to base table row that causes an update of an index row in a join index with row partitioning, a partitioning expression for that index row after the update must result in a value between 1 and the number of row partitions defined for that level. Otherwise, Teradata Database aborts the request and returns an error.

Using UPDATE (Upsert Form) with the DEFAULT Function

The following rules apply to using the DEFAULT function with the upsert form of UPDATE:

  • The DEFAULT function takes a single argument that identifies a relation column by name. The function evaluates to a value equal to the current default value for the column. For cases where the default value of the column is specified as a current built-in system function, the DEFAULT function evaluates to the current value of system variables at the time the request is executed.

    The resulting data type of the DEFAULT function is the data type of the constant or built-in function specified as the default unless the default is NULL. If the default is NULL, the resulting date type of the DEFAULT function is the same as the data type of the column or expression for which the default is being requested.

  • The DEFAULT function has two forms. It can be specified as DEFAULT or DEFAULT ( column_name ) . When no column name is specified, the system derives the column based on context. If the column context cannot be derived, the request aborts and an error is returned to the requestor.
  • All of the rules listed in Rules for Using the DEFAULT Function With Update also apply for an update in the upsert form of the UPDATE statement.
  • All of the rules listed in Inserting When Using a DEFAULT Function also apply for an insert in the upsert form of the UPDATE statement.

See SQL Functions, Operators, Expressions, and Predicates for more information about the DEFAULT function.

UDTs and UPDATE (Upsert Form)

UDT expressions can be used in an upsert form of UPDATE statement in any way that is consistent with the general support of UDTs by INSERT and UPDATE and with existing semantics for UPSERT (see Inserting Into Distinct UDT Columns, Inserting Into Structured UDT Columns, Updating Distinct UDT Columns, and Updating Structured UDT Columns.

The UPDATE statement used in the upsert operation can also use a mutator SET clause if it updates a structured UDT column. See Updating Structured UDTs Using a Mutator SET Clause.

UPDATE (Upsert Form) Support for Load Isolated Tables

A nonconcurrent load isolated update operation on a load isolated table updates the matched rows in-place.

A concurrent load isolated update operation on a load isolated table logically deletes the matched rows and inserts the rows with the modified values. The rows are marked as deleted and the space is not reclaimed until you issue an ALTER TABLE statement with the RELEASE DELETED ROWS option. See SQL Data Definition Language Syntax and Examples.