15.00 - Advantages of Normalization for Physical Database Implementation - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Advantages of Normalization for Physical Database Implementation

Teradata Database is the only commercially available database management system that is capable of physically implementing a fully-normalized logical model. Independent data warehousing consultant Neil Raden has written, “…3NF designs don’t support query and analysis […] The only routine exception to this is Teradata implementations: Because of the unique characteristics of the massively parallel architecture and database (sic) optimizer, Teradata can process analytical SQL against a 3NF schema with acceptable performance”. Note that relational query optimizers optimize SQL requests against databases, not the databases themselves.

The following list summarizes the advantages of physically implementing a normalized logical model for Teradata Database:

  • Greater number of relations
  • More primary index choices
  • Optimal distribution of data
  • Fewer full‑table scans
  • For example, consider the multicolumn primary index of a Fact table, which has the primary key of each of its dimension tables as a component. The Optimizer cannot retrieve a row with a partial primary index, so many, if not all, accesses to the Fact table must use a full‑table scan.

    This assumes that you implement the natural primary key of the fact table as the primary index. If you instead define a surrogate key column to be the primary index, the full‑table scan issue is moot (see the definition for “Surrogate key” under “Definitions” on page 75 and “Identity Columns” on page 818). This practice is not generally advised.

  • More joins possible
  • Enhanced likelihood the Optimizer will use the high-performing merge or nested join methods
  • Optimal data separation to eliminate redundancy from the database
  • Optimal control of data by eliminating update anomalies
  • Fewer columns per row
  • Optimal application separation
  • Optimal control of data
  • Smaller rows
  • Optimal data blocking by the file system
  • Reduced transient and permanent journal space
  • Reduced physical I/O