15.00 - Chapter 13 Designing for Missing Information - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

The efficacy and usefulness of the manner in which missing information is handled by SQL is in the eye of the beholder. Perhaps no other topic in relational database management generates as much controversy as does the dispute over the proper way to deal with missing information in relational databases. Independent of the semantic difficulties SQL presents when dealing with nulls, missing information also makes exploratory data analysis far more difficult than the identical data mining operation performed on data free of missing values.

The problems resulting from recording and manipulating missing information are concerned principally with the sometimes ambiguous, and often counterintuitive, meaning of nulls as reported by various types of database queries.

This chapter examines the various semantic ambiguities with nulls in order to make designers aware of the potential problems of interpretation they present. The principal thrust is to make you aware of the inconsistencies in the way SQL handles missing information. Careful use of nulls can provide benefits that cannot otherwise be achieved: at the same time, their careless use can present serious problems not only for the integrity of your databases, but also for the accuracy of the information you retrieve from them.

The recommendation for nulls is not to use them if you can avoid doing so. Constrain as many of your columns as possible to be NOT NULL. A carefully considered database design is often all that is required to avoid recording most nulls.