If a base relation R has a foreign key FK that matches the primary key PK of base relation S, then every value of FK in R must have one of the following properties:
- Equal to the value PK of some tuple in S.
- Wholly null (each value in the attribute set that defines FK is null).
This definition presents an obvious contradiction and is not supported by relational theory. By definition, a candidate key (and the FK in a referential relationship must be a candidate key for its relation) must be a unique identifier for its relation, yet a key that is wholly null contains no values, only markers for values that are missing; therefore, it cannot be unique.
If your enterprise business rules prohibit nulls in foreign key columns, you can enforce a non-null constraint on a column by defining the column with a NOT NULL attribute using either CREATE TABLE or ALTER TABLE.
In other words, a row cannot exist in a table with a (non-null) value for a referencing column if no equal value exists in its referenced column.
This relationship is referred to as referential integrity and the act of ensuring that this rule is enforced is referred to as maintaining referential integrity.
This principle is known as the referential integrity rule, and it is one of the fundamental principles of relational database theory. Besides ensuring database integrity, referential integrity constraints have the added benefit that they are frequently used to optimize join plans when implemented in your physical database design.
The referential integrity rule applies equally, the necessary changes being made, to all candidate keys, not just the key selected to be the primary key for a relation.
The intent of the referential integrity rule is to ensure that if some tuple r in relation variable R references some tuple s in relation variable S, then tuple s must exist.
This prevents you from corrupting the integrity of the database by deleting data that is still being used by another relation.
Entity Integrity and Foreign Key Nulls
Permitting a foreign key to be wholly null seems to contradict the entity integrity rule (see Rules for Primary Keys). This property is, to be sure, somewhat ad hoc, but it can have some practical usefulness. For example, consider the following case: an employee is working for some enterprise but has not yet been assigned to a department. Assuming that Department Number is a FK in the Employee relation, you can readily see that even when all the remaining information required to create an Employee relation tuple for the employee in question is available, you cannot create such a tuple in the Employee relation unless you also allow the FK to be null.
The tuple in question should be updated with the appropriate Department Number data just as soon as it is available, but allowing the null FK permits the enterprise to pay this employee even though she is not yet assigned to a department.
Alternatively, the problem can be avoided entirely with a minor change in the logical design of the database. See Redesigning the Database to Eliminate the Need for Nulls for an example.
As another alternative, you can assign default values to the FK field set and then update with the appropriate Department Number as soon as you know what it is. SQL facilities for assigning default values to columns are described in SQL Data Types and Literals.
Enforcing the Referential Integrity Rule
The relational model states the Referential Integrity rule without providing a mechanism to enforce it. In practice, the rule is enforced through the PRIMARY KEY and FOREIGN KEY clauses in the CREATE TABLE statement. The keyword REFERENCES describes the table and column set in which the primary key image of the foreign key resides.
The REFERENCES WITH NO CHECK OPTION specification explicitly instructs Teradata Database not to enforce referential integrity on the specified relationship. See Foreign Key Constraints for further information.
When referential integrity is enforced, you cannot delete a row from the referenced (parent) table as long as there is a row in the referencing table whose foreign key matches it.
Enforcing Referential Integrity in Teradata
To implement referential integrity (RI) in Teradata, you have three choices, ranked in their order of preference:
- Use the declarative referential integrity constraint checks supplied and enforced by the database software.
- Write your own, site-specific macros, triggers, or stored procedures to enforce RI.
- Enforce constraints through application code.
See “CREATE TABLE” in SQL Data Definition Language Detailed Topics for a description of the problems inherent in attempting to enforce database integrity using nondeclarative constraints.
Definition of a Referencing (Child) Table
The referencing table is referred to as the Child table, and the specified Child table columns are called referencing columns.
Referencing columns should be of the same number and have the same data type as the referenced table key.
Definition of a Referenced (Parent) Table
A child must have a parent, and the referenced table in a PK-FK relationship is referred to as the Parent table. The key columns in the Parent table that see a Child table are called referenced columns.
Since the referenced columns are defined as unique constraints, the referenced column set must be one of the following:
- A unique primary index (UPI), not null
- A unique secondary index (USI), not null
Definition of a Primary Key
With respect to RI, a primary key (more accurately, a candidate key) is a parent table column set that is referred to by a foreign key column set in a child table.
Definition of a Foreign Key
With respect to RI, a foreign key is a child table column set that refers to a primary key column set in a parent table.
Why Referential Integrity Is Important
Referential integrity is a mechanism to keep you from corrupting your database. Suppose you have a table like the following:
Part number and order number, each a foreign key in this relation, also form the composite primary key.
Suppose you were to go the part_number table and delete the row defined by the primary key value 1. The keys for the first and third rows in the order_part table are now parentless because there is no row in the part_number table with a primary key of 1 to support them. Such a situation exhibits a loss of referential integrity.
Now, suppose you had a mechanism to prevent this from happening? If you attempt to delete the row with a primary key value of 1 from the part_number table, the database management system does not allow you to do so. This is the way Teradata Database maintains referential integrity.
Besides data integrity and data consistency, referential integrity has the following benefits:
|Increases development productivity||It is not necessary to code SQL statements to enforce referential integrity constraints because Teradata automatically enforces Referential Integrity by means of declarative RI constraints.|
|Requires fewer programs to be written||All update activities are programmed to ensure that established declarative referential integrity constraints are not violated because Teradata enforces Referential Integrity in all environments: no additional programs are required.|
Referential Integrity Constraints
The combination of the foreign key, the parent key (more accurately, a candidate key), and the relationship between them defined by the referential integrity rule is called the referential integrity constraint.
This is a constraint defined on a table column set (using the CREATE TABLE or ALTER TABLE SQL statements) that represents a referential integrity link between two tables.
A referential integrity constraint is always defined for a foreign key column set in the child table in a relationship.
Note that you cannot use UDT, Period, Geospatial, BLOB, CLOB, or XML columns to define a referential integrity relationship or other database constraint.
See Designing for Database Integrity for additional information about how referential integrity is essential to maintaining the integrity of databases.
Teradata Database provides two other features related to referential integrity constraints: batch referential integrity constraints and referential constraints. The basic differences among the different referential constraint types are summarized in the following table.
|Referential Constraint Type||CREATE TABLE Syntax||Does It Enforce Referential Integrity?||Level of Referential Integrity Enforcement|
For more information, see ANSI Temporal Table Supportand Temporal Table Support.
|REFERENCES WITH NO CHECK OPTION||No||None|
|Batch referential integrity constraint||REFERENCES WITH CHECK OPTION||Yes||All child table rows must match a parent table row|
|Referential integrity constraint||REFERENCES||Yes||Row|
Referential constraints and temporal relationship constraints do not enforce the referential integrity of the database. Instead, they signal the Optimizer that certain referential relationships are in effect between tables, thus providing a means for producing better query plans without incurring the overhead of system enforcement of the specified RI constraints.
You should specify referential constraints and temporal relationship constraints only when you enforce the integrity of the referential relationship in some other way, or if the possibility of query errors and the potential for data corruption is not critical to your application. See Foreign Key Constraints and “CREATE TABLE Column Definition Clause” in SQL Data Definition Language for more information.
Batch referential integrity constraints are externally identical to regular referential integrity constraints. Because they enforce referential integrity in an all-or-none manner for an entire implicit transaction, they can be more high-performing than standard referential integrity constraints. In most circumstances, there is no semantic difference between the two, only an implementation difference. Batch referential integrity becomes important when a single statement inserts many rows into a table, like the massive inserts that can occur with INSERT … SELECT requests.
Example: Referential Integrity Constraints
The following CREATE TABLE statement defines the following referential integrity constraints on table_A.
- A column-level constraint on the foreign key column, column_A1, and the parent table (table_B) key column, column_B1.
- A table-level constraint on the composite foreign key column set (column_A1, column_A2) and the parent table table_C.
CREATE TABLE table_A ( column_A1 CHARACTER(10) REFERENCES table_B (column_B1), column_A2 INTEGER PRIMARY KEY (column_A1), FOREIGN KEY (column_A1, column_A2) REFERENCES table_C);
According to this definition, the single-column primary key column_A1 must reference the primary key column column_B1 of the parent table table_B. Table_B must have a primary key, of which one column is column_B1, having the following definition.
CHARACTER(10) NOT NULL
The composite foreign key (column_A1, column_A2) must reference the primary key of the parent table table_C. Table_C must have a two-column primary key, the first column of which has the following definition.
CHARACTER(10) NOT NULL
The second column of the Table_C primary key must have the following definition.
INTEGER NOT NULL
Rules for Referential Integrity Constraints
Referential integrity constraints must obey the following set of rules:
- The parent key must exist when the referential integrity constraint is defined.
- The parent key columns must be either a unique primary index (UPI) or a unique secondary index (USI).
- The foreign and parent keys must have the same number of columns and their data types must match.
- Duplicate referential integrity constraints are not allowed.
- A foreign key value must be equal to its parent key value or it must be null.
- Self-reference (a condition in which the Parent and Child tables are the same table) is allowed, but the foreign and parent keys cannot consist of identical columns.