Third Normal Form (3NF) deals with the elimination of nonkey attributes that do not describe the primary key (more accurately, 3NF deals with the elimination of nonkey attributes that do not describe any candidate key for the relation).
Definition of Third Normal Form
A relation variable R is said to be in Third Normal Form when it is in 2NF and one of the following statements is also true for every nontrivial functional dependency (assume the functional dependency X → A):
- The attribute set x is a superkey.
- Attribute A is part of some candidate key.
Another way of stating the rule is this: The relationship between any two nonkey attributes or attribute sets (excluding attributes having no duplicates allowed constraints) must not be one-to-one in either direction.
This condition is met when both of the following criteria have been met:
- Every nonkey attribute depends on all attributes of the primary key: the entire primary key.
Again, this is more accurately stated as the entire candidate key.
- No non-key attribute is functionally dependent on another non-key attribute of the relation.
The formal definition for Third Normal Form is as follows: For a relation to be in 3NF, the relationship between any two nonprimary key attributes (more accurately, between any two non-candidate key attributes) or groups of attributes in a relation must not be one-to-one in either direction. In other words, the nonkey attributes are nontransitively dependent upon each other and the key. Having no transitive dependencies in a relation implies no mutual dependencies.
Attributes are said to be mutually independent if none of them is functionally dependent on any combination of the others. This mutual independence ensures that individual attributes can be updated without any danger of affecting any other attribute in a tuple.
This is an incomplete definition for 3NF that fails to account for functional or transitive dependencies. To account for this special case, the definition of 3NF is extended as described in the following section.
Definition of Boyce-Codd Normal Form
When the relational model of database management was originally proposed, it only addressed what are now known to be the first three normal forms. Later theoretical work with the model showed that 3NF required further refinement to eliminate certain update anomalies.
The classic definition for third normal form does not handle situations in which a relation R has multiple composite candidate keys with overlapping attributes such as the following, where CK 1, CK 2, and CK 3 represent overlapping candidate keys on the overlapping attribute composites A 1-A 2, A 2-A 3, and A 3-A 4, respectively:
|A 1||A 2||A 3||A 4||A 5||A 6|
|CK 1||CK 3|
While this situation does not occur frequently, it does present itself from time to time.
Boyce-Codd Normal Form addresses this situation. BNCF reduces to 3NF whenever the special situation that defines this problem does not apply.
A relation is in BCNF if and only if every determinant is a candidate key. This means that only determinants can be candidate keys.
Somewhat more formally, a relation is said to be in BCNF when it is in 2NF and the following is true: if whenever X → A and A does not belong to X, then X is a superkey.
Meaning of Third Normal Form in This Manual
For purposes of this manual, use of the term 3NF generally covers BCNF as well. Relations in BCNF are often awkward to deal with, however, and are usually more tractable if decomposed into other relations in 4NF.
For example, consider the following relations:
If these relations had been designed so that the CustName attribute was in the Order relation, 3NF would have been violated because there would be a one-to-one relationship between two nonkey attributes, CustNum and CustName.
Example: Violating 3NF
Suppose the Order relation had been structured like this:
The potential update anomalies associated with this violation of 3NF are the following:
- Changed customer name.
To change the name of a customer, you must find every occurrence of its customer name.
- Inconsistent customer names.
Adams might be misspelled "Addams" in an occurrence, which would cause the tuple containing the misspelled customer name to be missed in a query having the WHERE predicate customername = Adams.
Worse still, customer number 2 might be labeled Adams in one tuple and Zoller in another.
- Inability to add new customer names unless they have an order placed.
Now consider the following entity, which is in 3NF, but still has some problems:
|Rancho Bernardo||Physical Database Design||1||2000/05/15||A/225|
|El Segundo||Relational Database Modeling Workshop||2||2000/08/15||ES/16-201|
|Rancho Bernardo||Advanced SQL for Teradata Database||1||2000/06/30||A/225|
Note that the buildings determine campus because no two buildings on any of the Teradata campuses have the same abbreviation. As a result, Building/Room → Campus. The relation is not in Boyce-Codd normal form.
Normalize the relation by decomposing it into the two following BCNF relations:
|Rancho Bernardo||Physical Database Design||1||2000/05/15|
|El Segundo||Relational Database Modeling Workshop||2||2000/08/15|
|Rancho Bernardo||Advanced SQL for Teradata Database||1||2000/11/23|