15.00 - Semantics of Nulls in the Outer Join - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Semantics of Nulls in the Outer Join

The motivation behind the outer join is to preserve information that is otherwise lost in an inner join. There can be no doubt that this is an important issue, but the SQL solution to the problem presents some inconsistencies regarding the semantics of the nulls reported by an outer join. The following contrast of simple inner and outer natural joins on the same two tables is based on an example developed by Date (1992c). The respective joins are between the following supplier and supplier_parts tables:

 

supplier

 

supplier_parts

suppl_num

suppl_name

status

city

 

suppl_num

part_num

quantity

PK

 

 

 

 

PK

 

S2

Jones

10

Paris

 

S2

P1

300

S5

Adams

30

Athens

 

S2

P2

400

Here is the result of the inner join of the supplier and supplier_parts tables.

 

This join outcome has no information about supplier S5, therefore, it is said to have lost that information.

Here is the result of the natural outer join of the supplier and supplier_parts tables:

 

This join outcome retains information about supplier S5, therefore it is said to preserve that information. Note that the result reports nulls for the part_num and quantity columns for supplier S5.

Think back to the definition of null semantics that was presented in “Semantics of SQL Nulls” on page 673. Recall that the unequivocal definition for an SQL null is that it represents an unknown value. In this case, however, the values for both part_num and quantity for supplier S5 are known, and those values are both the empty set. Because SQL does not support empty sets directly, it substitutes nulls in their place. This is also true for the result of the length determination of an empty character string (see “Inconsistencies in How SQL Treats Nulls” on page 674).

The important thing to remember with respect to the semantics of outer join nulls is that nulls represent both the empty set (when used to extend the inner join to preserve information that would otherwise be lost) and semantically correct SQL nulls (when used to represent unknown information in the inner join portion of the result). The determination of which null is which is left to the user.

Note that in set theory, the empty set is sometimes referred to as the null set. It is extremely important to understand that the word null in this context means something entirely different from its meaning in the SQL language. The null set is simply a set that contains no members. It is, therefore, empty, which is why this set is also referred to as the empty set.

Also note that there is only one empty set in set theory, analogous to how there is only one value of 3 in the set of real numbers, only one value of 3 in the set of cardinal numbers, and so on. Therefore, it is referred to as the empty set, not an empty set. Note, too, that studies of the properties of the empty set are sometimes referred to as nullology. Again, this term has absolutely nothing to do with nulls as they are defined by the SQL language.