15.00 - Validating the Integrity of Base Tables In a Referential Constraint Relationship - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

prodname
Teradata Database
vrm_release
15.00
category
Programming Reference
featnum
B035-1184-015K

Validating the Integrity of Base Tables In a Referential Constraint Relationship

If you decide to specify a non‑enforced Referential Constraint for a table, the responsibility for ensuring or validating the referential integrity of that table is yours alone. The system does not enforce referential integrity when you specify Referential Constraint relationships between tables, because no declarative constraints are declared in the appropriate column definitions.

From the aspect of integrity assurance, the best way to guarantee the referential integrity of a table without taking advantage of a declarative standard or batch referential constraint is to use a procedural constraint such as a set of triggers to handle inserts, updates, and deletions to the tables in the relationship.

For example, you might want to create DELETE/UPDATE triggers on parent tables, and INSERT/UPDATE triggers on child tables to enforce referential integrity. The reasons for preferring declarative constraints over procedural constraints are described briefly in Database Design.

There is the additional likelihood that actively firing triggers will have a greater negative effect on system performance than the simple declarative constraint they are intended to replace.

If you decide not to enforce any form of referential integrity constraint, then you are strongly advised to enforce a set of validation procedures that can detect when and where referential integrity violations occur.

The following scenario uses a basic SQL query to interrogate a table set for violations of the referential integrity rule.

Suppose you create the tables pk_tbl1 and softri_tbl1, with column b2 of softri_tbl1 having a Referential Constraint relationship with column a1 of pk_tbl1. In this relationship, table pk_tbl1 is the parent and table softri_tbl1 is the child. The DDL for defining these tables might look like the following CREATE TABLE requests.

     CREATE TABLE pk_tbl1 (
       a1 INTEGER NOT NULL PRIMARY KEY, 
       a2 INTEGER);
 
     CREATE TABLE softri_tbl1 (
       b1 INTEGER, 
       b2 INTEGER CONSTRAINT softri_1 
          REFERENCES WITH NO CHECK OPTION pk_tbl1(a1));

Column softri_tbl1.b2 is an implicit foreign key referencing the primary key column
pk_tbl1.a1.

Now populate the tables with data as follows.

     INSERT INTO pk_tbl1 (a1, a2) VALUES (11, 111);
     INSERT INTO pk_tbl1 (a1, a2) VALUES (22, 222);
 
     INSERT INTO softri_tbl1 (b1, b2) VALUES (100, 11);
     INSERT INTO softri_tbl1 (b1, b2) VALUES (200, 22);
     INSERT INTO softri_tbl1 (b1, b2) VALUES (300, 33);
     INSERT INTO softri_tbl1 (b1, b2) VALUES (400, 44);

The third and fourth inserts into table softri_tbl1 violate the implicit referential integrity relationship between the tables because the set of distinct values for softri_tbl1.b2 should be identical to the set of values for pk_tbl1 (which, as the primary key value set, constitute a distinct set by default). In this particular scenario, the softri_tbl1.b2 values 33 and 44 identify the rows that violate the implied referential integrity relationship.

The following SELECT request is a generic query to test for this type of corruption. The exclusion of foreign key nulls is included in the query because it is not possible to determine what values they represent.

     SELECT DISTINCT childtable.* 
     FROM childtable, parenttable
     WHERE childtable.fk NOT IN (SELECT pk 
                                 FROM parenttable)
     AND   childtable.fk IS NOT NULL;

The scenario defines the following set of correspondences.

 

Generic Query Element

Specific Query Element

childtable

softri_tbl1

parentable

pk_tbl1

childtable.fk

softri_tbl1.b2

From these correspondences, the specify query to test for corrupt rows in this scenario would look like the following SELECT request.

     SELECT DISTINCT softri_tbl1.* 
     FROM softri_tbl1, pk_tbl1
     WHERE softri_tbl1.b2 NOT IN (SELECT a1 
                                  FROM pk_tbl1)
     AND   softri_tbl1.b2 IS NOT NULL;
 
      *** Query completed. 2 rows found. 2 columns returned.
      *** Total elapsed time was 1 second.
 
              b1           b2
     -----------  -----------
             300           33
             400           44

The report generated by this query returns the rows with the implicit foreign key values 33 and 44, which is what was expected: the set of rows from the child table whose foreign key values do not match any element of the primary key value set from the parent table. This set of non‑matching rows must be deleted from the child table to maintain the referential integrity of the database.

You should perform this validation query regularly, particularly after any substantial updates are made to the tables in the implied relationship.

Keep in mind that your data can be corrupt for the entire interval between the times you run this query, and you still have the burden of repairing any problems that it detects. The query does not solve the problem of failing to enforce referential integrity in your database: all it does is allow you to detect whether the referential integrity defined by 1 primary or alternate key in a parent table and 1 foreign key in a child table has been violated. Each such unenforced referential relationship must be probed regularly to detect violations.