15.00 - Denormalized Views Versus Physical Denormalization of the Database Schema - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Denormalized Views Versus Physical Denormalization of the Database Schema

A fully-normalized database schema accessed through denormalized views is far more flexible in creating and managing dimensional consolidation hierarchies for OLAP models than a star schema.

Note that Ross and Kimball do not address dimensional views as a solution to resolving the differences between normalized and dimensional database design. Their “hybrid approach” is conceptually similar, and leads to the identical outcome, but it is explicitly redundant, requiring two separate physical databases to implement. The dimensional view approach, by using one physical database, eliminates both the additional costs and the data quality problems of maintaining redundant data.

DBAs who deal with denormalized methods typically define a single set of mutually exclusive dimensions for a given OLAP model. However, it is possible to create many nonexclusive dimensional paths for an individual OLAP model that also are reusable across many different OLAP models.

The following figure shows two denormalized virtual schemas that share a number of common dimensional views:

The number of virtual schemas that can be created and aligned with particular users or user communities is unlimited. As DBAs become more sophisticated in their application of this concept, they can manage an extensive number of database views tailored toward the specific needs of individual users as well as the specific needs of various user groups. It is not possible to maintain such an extensive schema management program when the data must be physically propagated for performance reasons.

Consider another example: suppose there are eight potential dimensional consolidation paths that could arise from the following three entities:

  • Households
  • Customers
  • Accounts
  • In OLAP terminology, a consolidation path is a set of hierarchically related data. Consolidation itself is the process of aggregating the hierarchical data to form subtotals. The highest level in the consolidation path is the dimension for the data.

    The eight possible consolidation paths for this data are the following, where   indicates a downward path through a hierarchy and in E-R terminology is equivalent to the word have:

  • Households
  • Customers
  • Accounts
  • Households   Customers   Accounts
  • Households   Accounts   Customers
  • Households   Customers
  • Households   Accounts
  • Customers   Accounts
  • When either these consolidation paths or the previously mentioned schemas are created virtually, any problems with managing referential integrity are rendered moot because the only data that must be maintained is in the base tables. However, when these tables are instead instantiated as separate physical tables, managing referential integrity across the multiple alternate dimensional paths becomes very difficult to maintain. As more dimensional entities and potentially hundreds more dimensional consolidation paths are added to the schema, it becomes impossible to continue to propagate data physically. As a result, if a physical star schema is implemented, compromises to the maintenance of the semantic integrity of the data must be made to accommodate the resulting complexity.