15.00 - Common Problems With Data Marts - Teradata Database

Teradata Database Design

Common Problems With Data Marts

Prospective builders of data warehouses are frequently advised to “start small” with a data mart and use that kernel to expand gradually into a full-blown data warehouse. This approach to warehousing generally leads to failed projects, for several reasons.

Sometimes the new data mart is so successful that the configuration is overrun by user demands. The databases grow too large too fast, response times become unacceptably long, and frustrated users search for other ways to get their answers.

The more common reason for failure is that the data mart is immediately unsuccessful because it is designed in such a way that users cannot retrieve the information they want and need to extract from the data. Databases are highly denormalized to respond to a small set of canned queries; summaries, rather than detail data, make up the database, so fine-grained exploratory data analysis is not possible; and support for ad hoc queries is either absent or so poor as to discourage users from bothering with them.
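The cost of storing summaries in place of detail data can be sketched in a few lines. The following is a minimal illustration, not Teradata code: the row layout, the sample values, and the `summarize` helper are all hypothetical, and plain Python stands in for the SQL a real data mart would use.

```python
from collections import defaultdict

# Hypothetical detail-level sales rows: (store, product, day, amount).
DETAIL_ROWS = [
    ("north", "widget", "2024-01-01", 10.0),
    ("north", "gadget", "2024-01-01", 25.0),
    ("south", "widget", "2024-01-02", 15.0),
    ("south", "gadget", "2024-01-02", 5.0),
]


def summarize(rows, key_index):
    """Total the amount column, grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


# The canned summary a data mart might store: total sales per store.
sales_by_store = summarize(DETAIL_ROWS, 0)

# An ad hoc question nobody planned for: total sales per product.
# It can be answered from the detail rows...
sales_by_product = summarize(DETAIL_ROWS, 1)
# ...but not from sales_by_store, which no longer carries a product column.

print(sales_by_store)
print(sales_by_product)
```

Any summary can be rebuilt from the detail rows, but the reverse is not true: once only `sales_by_store` is kept, the per-product question has no answer.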

The very factors that frequently defeat data mart projects are also the most commonly recommended approaches to designing data marts and data warehouses in the popular data warehousing literature:

  • Denormalization (dimensional modeling)
  • Storing aggregates at the expense of detail data
  • Skewing performance toward a small, preselected set of queries at the expense of all other exploratory analyses

Teradata refers to this approach to building a data warehouse as data mart-centric. Instead of designing a data warehouse from the ground up, the data mart-centric approach begins with one or several highly customized data marts and then attempts to expand the kernel into a full-blown data warehouse at some point.