Not long ago, backups were a minor issue in database management. You only had to set up a batch job to perform the backup overnight, while the system was not being used. Multiple issues have complicated that once common scenario.
First is the tremendous increase in the size of databases that can be supported by a relational database management system like Teradata. Databases once had multiple gigabytes. Now multiterabyte databases support large data warehouses, and support for petabyte and even yottabyte databases is coming. Magnetic tape is a serial medium. Even if you archive to multiple tape drives simultaneously, backing up multiterabyte databases takes significant time.
Second is the end of the era of the batch window. Enterprise data warehouses now typically support worldwide operations, and that means that the system must be available to users 24 hours per day, seven days a week. Without the old batch window, how do you back up your business-critical data without having a significant negative effect on the capability of your data warehouse to support its worldwide user base?
This topic describes one method for minimizing the impact of backing up large databases can be designed into your database.
Terminology
For purposes of this topic, large and small tables and databases are defined as follows:
| Term | Definition |
|---|---|
| Small table Small database |
Less than 100 GB. |
| Large table Large database |
Greater than 100 GB. |
Product Issues
The issue described in this topic and its suggested solution apply to the large system environment, primarily when using the NetVault product to backup to, and recovery from, different machines. With the REEL product, you can restore to, or recover from, the same machine using its fast path option, but even that option does not relieve the problem of backing up to a second machine.
The Scenario
- Back up the large table in one pass and then back up the remaining tables in a second pass.
- Back up the entire database in one pass, which takes a long time to complete (and even longer to restore)
A Solution
Design your database so that large tables are isolated within their own, separate databases, then build views for your users to use to access the data. Carefully constructed views can mask this separation and permit users to access data across multiple databases transparently within a single query. The following graphic shows one possible design that uses this method.