Capacity Planning
When there is no legacy database to build on, capacity planning can be a difficult undertaking. Fortunately, this is rarely an issue for contemporary sites, because at least part of their corporate data is almost always maintained in electronic form. Keep in mind that much of the information presented in this chapter assumes you have a legacy system to draw on when making your sizing estimates.
Capacity planning should begin with the goal of keeping the most frequently accessed data available at all times. With the relatively low-priced, large-capacity disk storage units commonly used for data warehousing applications, the emphasis has shifted from offloading as much historical data as possible to archival storage toward keeping all data online and accessible to the warehouse indefinitely.
In a data warehouse that maintains massive quantities of historical data, the volume of data is typically inversely proportional to its use. In other words, there is an enormous amount of cold historical data that is accessed only lightly, and a relatively small volume of hot and warm data that is accessed frequently.
The following, somewhat loose, default definitions apply to the commonly described temperature bands.
| Temperature | Default Definition |
|---|---|
| COLD | The 20% of data that is least frequently accessed. |
| WARM | The remaining 60% of data that falls between the COLD and HOT bands. |
| HOT | The 20% of data that is most frequently accessed. |
| VERY HOT | Data that you or Teradata Virtual Storage think should be added to the Very Hot cache list and have its temperature set to very hot when it is loaded using the TVSTemperature query band. |
Note: Teradata Virtual Storage tracks data temperatures at the level of cylinders, not tables. Because the file system obtains its temperature information from Teradata Virtual Storage, it also handles temperature‑related compression at cylinder level. See Teradata Virtual Storage for more information about this.
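For example, a load session can ask Teradata Virtual Storage to treat the incoming data as very hot by setting the TVSTemperature query band before the load runs. The following is a minimal sketch: the VERYHOT value string and the sandbox.web_clicks and sandbox.web_clicks_stage table names are assumptions for illustration, so confirm the exact value strings supported by your release in the query band documentation.

```sql
-- Ask Teradata Virtual Storage to place the data loaded by this session on the
-- Very Hot cache list (VERYHOT is an assumed value string; verify for your release).
SET QUERY_BAND = 'TVSTemperature=VERYHOT;' FOR SESSION;

-- Illustrative load; both table names are hypothetical.
INSERT INTO sandbox.web_clicks
SELECT * FROM sandbox.web_clicks_stage;

-- Remove the query band so later work in this session is not affected.
SET QUERY_BAND = NONE FOR SESSION;
```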
The file system can change the compressed state of the data in an AUTOTEMP table at any time based on its temperature. Cylinders in an AUTOTEMP table become eligible for temperature‑based block‑level compression only when they reach or fall below the threshold defined for COLD temperature‑based block‑level compression. See “TempBLCThresh” in Utilities for more information about the temperature settings that you can use for temperature‑based block‑level compression.
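As a point of reference, the following sketch shows how a table might be defined so that the file system manages its block-level compression by temperature. It assumes the BLOCKCOMPRESSION = AUTOTEMP table option and uses hypothetical database, table, and column names; confirm the exact option syntax in the SQL Data Definition Language documentation for your release.

```sql
-- Hypothetical history table whose block-level compression is left to the
-- file system, which compresses or decompresses cylinders as their
-- temperature crosses the TempBLCThresh threshold.
CREATE TABLE sandbox.sales_hist,
    FALLBACK,
    BLOCKCOMPRESSION = AUTOTEMP
    (
      sale_id    INTEGER NOT NULL,
      sale_date  DATE FORMAT 'YYYY-MM-DD',
      amount     DECIMAL(18,2)
    )
PRIMARY INDEX ( sale_id );
```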
Temperature‑based thresholds for the block‑level compression of AUTOTEMP tables work as defined by the following table.
| IF data blocks are initially … | AND then become … | THEN the file system … |
|---|---|---|
| block-level compressed | warmer than the defined threshold for compression | decompresses them. |
| not block-level compressed | colder than the defined threshold for decompression | compresses them. |
For tables that are not defined with BLOCKCOMPRESSION=AUTOTEMP, you must control the block-level compression state yourself using Ferret commands; alternatively, if a table is not yet populated with rows, you can use one of the TVSTemperature query bands to specify the type of block-level compression to apply to the newly loaded rows, as shown in the sketch that follows. If temperature-based block-level compression is disabled but block-level compression is enabled, Teradata Database treats AUTOTEMP tables the same as MANUAL tables.
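The following sketch illustrates that second alternative: a table is created with BLOCKCOMPRESSION = MANUAL, and a TVSTemperature query band is set before the first load so that the newly loaded rows are written with the desired block-level compression. The COLD value string and the object names are assumptions for illustration, and after the initial load the compression state of a MANUAL table is changed with Ferret commands, as noted above; verify the supported value strings and the empty-table restriction for your release.

```sql
-- Hypothetical archive table whose block-level compression is controlled
-- manually rather than by temperature.
CREATE TABLE sandbox.sales_archive,
    FALLBACK,
    BLOCKCOMPRESSION = MANUAL
    (
      sale_id    INTEGER NOT NULL,
      sale_date  DATE FORMAT 'YYYY-MM-DD',
      amount     DECIMAL(18,2)
    )
PRIMARY INDEX ( sale_id );

-- Because the table is still empty, a TVSTemperature query band can specify
-- how the newly loaded rows are compressed (COLD is an assumed value string).
SET QUERY_BAND = 'TVSTemperature=COLD;' FOR SESSION;

INSERT INTO sandbox.sales_archive
SELECT * FROM sandbox.sales_stage;   -- hypothetical staging table

SET QUERY_BAND = NONE FOR SESSION;
```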
For all of the data in an AUTOTEMP table to be block-level compressed or decompressed at once, Teradata Virtual Storage must recognize that every cylinder in the table has reached the threshold specified by the DBS Control parameter TempBLCThresh. For example, suppose the threshold value for TempBLCThresh is set to WARM.
| IF all of the cylinders in the table … | THEN they all become eligible for … |
|---|---|
| reach or fall below the WARM or COLD thresholds | block-level compression. |
| reach or exceed the HOT or VERY HOT thresholds | decompression. |
Because of this, the best practice for a table that requires a consistent compression state across the entire table is to avoid the AUTOTEMP option, or to avoid temperature-based block-level compression altogether.