Identity Columns, Duplicate Column Values, and Duplicate Rows - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:id
B035-1184
Product Category
Teradata Vantage™

The use of identity columns presents 2 different duplication issues.

  • Duplicate identity column values
  • Unintentionally duplicated rows

Unless the column is otherwise constrained to be unique, values for an identity column are guaranteed to be unique only when the column is specified using GENERATED ALWAYS … NO CYCLE.
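As a minimal sketch of such a definition, the following statement creates an identity column with GENERATED ALWAYS and NO CYCLE. The table name, column names, data types, and option values are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table; all names and option values are illustrative.
CREATE MULTISET TABLE employee (
  employee_number INTEGER GENERATED ALWAYS AS IDENTITY
    (START WITH 1
     INCREMENT BY 1
     MINVALUE 1
     MAXVALUE 2147483647
     NO CYCLE),               -- stop generating at MAXVALUE instead of reusing values
  first_name VARCHAR(30) NOT NULL,
  last_name  VARCHAR(30) NOT NULL
)
PRIMARY INDEX (last_name);    -- nonunique primary index, so nothing else constrains employee_number
```

Because no unique index or constraint is defined on employee_number in this sketch, its uniqueness rests entirely on the GENERATED ALWAYS … NO CYCLE specification.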

Duplicate identity column values can occur in either of the following situations.

  • The column is specified to be GENERATED ALWAYS, but the CYCLE option is specified.

    In this case, the column can reach its maximum or minimum value and then begin recycling through previously generated values.

  • The column is specified to be GENERATED BY DEFAULT and an application specifies a duplicate value for it.
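The second situation can be sketched as follows, assuming a hypothetical table whose identity column is GENERATED BY DEFAULT and has no uniqueness constraint. The table name, column names, and inserted values are illustrative only.

```sql
-- Hypothetical table: GENERATED BY DEFAULT accepts values supplied by the application.
CREATE MULTISET TABLE employee_by_default (
  employee_number INTEGER GENERATED BY DEFAULT AS IDENTITY
    (START WITH 1
     INCREMENT BY 1
     NO CYCLE),
  first_name VARCHAR(30),
  last_name  VARCHAR(30)
)
PRIMARY INDEX (last_name);

-- The application supplies the same identity value for two different employees.
INSERT INTO employee_by_default (employee_number, first_name, last_name)
VALUES (100, 'Reiko', 'Kawabata');
INSERT INTO employee_by_default (employee_number, first_name, last_name)
VALUES (100, 'Ana', 'Silva');

-- Report every identity value that occurs more than once.
SELECT employee_number, COUNT(*) AS occurrences
FROM employee_by_default
GROUP BY employee_number
HAVING COUNT(*) > 1;
```

The same GROUP BY … HAVING query can be used to check a column defined with the CYCLE option for values that have been generated more than once.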

Duplicate rows can occur in any of the following situations.

  • A previously completed load task is resubmitted erroneously.

    Tables that do not have identity columns, but that are either specified to be SET tables or have at least 1 unique index, do not permit the insertion of duplicate rows.

  • Teradata Parallel Data Pump runs without the ROBUST option enabled and a restart occurs.
  • A session aborts, but rows inserted before the abort occurred are not deleted before the session is manually restarted.

In many cases, such rows are not duplicates in the sense defined by the relational model. For example, when a load task is mistakenly run multiple times, each new row corresponds to the same client row, which does not carry the uniqueness-enforcing identity column value that is defined for it on the server. Each copy receives a different identity column value on the server, so the copies are not duplicates of one another in the strict relational sense.

Suppose, for example, you accidentally load employee Reiko Kawabata into an employee table twice, where the employee_number column is an identity column. After doing this, you have 2 employee rows that are identical except for their different employee_number values. While this is an error from the perspective of the enterprise, the 2 rows are not duplicates of one another because they have different employee_number values. The problem is not with the identity column feature, which works exactly as it is designed to work.
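The scenario can be sketched as follows. The sketch assumes a hypothetical employee table whose employee_number column is a system-generated identity column and whose only other columns are first_name and last_name. Those column names are assumptions made for illustration.

```sql
-- Assumes a hypothetical employee table with an identity column employee_number
-- and the non-identity columns first_name and last_name.

-- The same client row is loaded twice; each copy is assigned its own employee_number.
INSERT INTO employee (first_name, last_name) VALUES ('Reiko', 'Kawabata');
INSERT INTO employee (first_name, last_name) VALUES ('Reiko', 'Kawabata');

-- Group on every column except the identity column to find rows that differ
-- only in their employee_number values.
SELECT first_name,
       last_name,
       COUNT(*)             AS copies,
       MIN(employee_number) AS lowest_employee_number,
       MAX(employee_number) AS highest_employee_number
FROM employee
GROUP BY first_name, last_name
HAVING COUNT(*) > 1;
```

A query of this kind only flags candidates. Whether two such rows represent the same employee or 2 different employees who share a name is a decision for the enterprise, not something the database can determine.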

You must therefore enforce rigorous guidelines for handling identity column tables at your installation to ensure that these kinds of nebulous duplications do not corrupt your databases.