Temperature‑Based Block-Level Compression
Temperature‑based block-level compression uses the temperature of data as maintained by Teradata Virtual Storage to determine what to compress. For example, COLD or WARM data might be compressed automatically, while HOT data might be decompressed automatically if it was previously compressed.
Temperature‑based block‑level compression applies only to permanent user data tables. The following types of tables and file system structures cannot be managed by temperature‑based block‑level compression.
Teradata Virtual Storage tracks the cylinder temperature metrics for all tables when temperature‑based block-level compression is enabled. Temperatures are in effect system‑wide, and Teradata Virtual Storage takes all temperatures into account when it determines which data on a system is associated with which temperature.
Once that determination has been made, the file system judges if a table is using temperature‑based block-level compression and if so, applies selective block‑level compression or decompression based on the cylinder temperatures. For example, when the DBS Control parameter TempBLCThresh is set to COLD, the threshold defined for COLD applies to all tables in the system, so if the cylinders of an AUTOTEMP table are in that COLD temperature band, they are eligible for block-level compression. Similarly, if TempBLCThresh is set to HOT, the cylinders in a AUTOTEMP table would become eligible to be decompressed only if their temperature met or exceeded that threshold.
The file system cannot compress the cylinders of a MANUAL table using either ordinary block‑level compression or temperature‑based block‑level compression. Instead, you must use the Ferret commands COMPRESS or UNCOMPRESS to change the compression status of an existing MANUAL table or subtable manually. You can also use the BlockCompression query band to load rows into new MANUAL tables or subtables using the appropriate block‑level compression.
You can activate temperature‑based block‑level compression for a new MANUAL table by using an ALTER TABLE request to change its BLOCKCOMPRESSION definition from MANUAL to AUTOTEMP. You can also use one of the following TVSTemperature query bands to load data into a new AUTOTEMP table or subtable.
You can also use the Ferret commands COMPRESS and UNCOMPRESS to manually change the state of an AUTOTEMP table. Note that Ferret returns a warning message to the requestor who attempts to do this.
If the block-level compressed state of the data in an AUTOTEMP table does not match its temperature, the block-level compression changes you make using Ferret might be undone by the system over time. Because of this, using the COMPRESS and UNCOMPRESS commands to change the compression status of an AUTOTEMP table is not recommended.
The same set of subtables that can be compressed with ordinary block-level compression are compressed with temperature‑based block-level compression.
There are a number of advantages to managing compression based on the data temperature.
In some cases, you might not even realize that data is no longer being used. Temperature‑based compression automatically locates such data and compresses it.
Temperature‑based compression enables you to compress such partition data automatically.
Temperature‑based block-level compression is dynamically controlled by the AutoTempComp background task. This task does things like the following to cylinders that contain a single or multiple user table or to multiple subtables of a single user table.
Data whose temperature changes from WARM to HOT is also decompressed.
Several DBS Control parameters define boundaries to control AutoTempComp, preventing it from continually compressing and decompressing data that is on a boundary. The TempBLCSpread parameter identifies the minimum spread for such cases. For example, compressing data that is 5% below a defined border between WARM and COLD border data and only decompressing such data when the temperature becomes 5% above the defined border.
The TempBLCThresh parameter defines the temperature at which data is block‑level compressed or decompressed. This enables some sites to set the boundary between COLD and WARM, other sites to set the boundary between WARM and HOT, and other sites to set the boundary for only the defined bottom percentage of the very coldest data.
The BLOCKCOMPRESSION table‑level attribute of the CREATE TABLE and ALTER TABLE statements categorizes tables into 3 groups, as indicated by the following set of keyword options.
If temperature‑based block-level compression is disabled but block-level compression is enabled, Teradata Database treats AUTOTEMP tables the same as MANUAL tables.
Note: You can use the non‑temperature‑related BlockCompression query band to load an empty table with non‑temperature‑related data to compress.
This means that Teradata Database rejects Ferret commands to manually compress table data, but Ferret commands to decompress table data are valid.
A fourth option, DEFAULT, determines whether the block compression for a table should be defined as AUTOTEMP, MANUAL, or NEVER based on the setting of the DefaultTableMode DBS Control parameter. DEFAULT is the default for all new tables and for tables on a system that is being upgraded.
AUTOTEMP tables normally exist in a mixed compression state because temperature‑based block-level compression does not directly apply compression to an entire table all at once. As various cylinders within a table grow warmer or colder over time, the file system either compresses or decompresses those cylinders as is appropriate for their Teradata Virtual Storage temperature.
Cylinders in an AUTOTEMP table become eligible for temperature‑based block-level compression only when they reach, become lower than, or exceed the defined threshold for the specified TempBLCThresh option. For all of the data in a table to be block compressed (or decompressed) at once, Teradata Virtual Storage must become aware that all cylinders in the table have reached or exceeded the threshold for the specified TempBLCThresh option. This would only occur if all of the cylinders in the table had not been accessed in some time and had been classified as COLD with respect to all other cylinders for compression or as HOT for decompression. Because of this, the best practice is not to use temperature‑based block‑level compression for a table that you think requires compression consistency for the entire table.
As COLD disk cylinders are compressed and consolidated across time, each cylinder can contain a greater number of data blocks. Because of this, fewer cylinders are needed to hold the compressed data. This causes additional cylinders to be categorized as COLD that might not have been considered COLD initially, and the number of allocated cylinders decreases while the number of free cylinders increases.
All these factors cause increasingly more data blocks to be stored in a compressed form, to the point where all cylinders considered by Teradata Virtual Storage to be COLD have been compressed.
When a table is in a state of mixed block‑level compression, all of its data remains accessible and can be read or written to as required without incurring any notable performance issues. The only conclusions you should draw about a table with mixed block-level compression are that they do not have the most efficient space utilization at the levels of file system space or database space and the performance of read and update operations on them are often unpredictable. For those tables over which you want to have total control, MANUAL is probably the best block-level compression option.
Because the Teradata Virtual Storage temperature of data is a relative value, if you access 50% of your data daily and 50% every other day, the second group of data becomes classified as COLD even though it is accessed fairly often, and compressing data based on its temperature might compress this COLD data. You might want to mark a table such as this one as being managed manually to avoid unwanted compression.
To use temperature‑based block-level compression, you should set the following DBS Control parameters as noted. When a setting is site‑dependent, its setting is given as optional.
This parameter must also set ON to enable temperature‑based block-level compression.
The best practice is to modify the option specified for the BLOCKCOMPRESSION attribute for individual tables to AUTOTEMP when required.
For more information on these parameters, see the Utilities: Volume 1 (A-K).
The permanent table data used most often remains in-memory for faster access. The data stays in VERYHOT cache until other data is used more often and replaces it in the cache. For more information manually setting the data temperature of cylinders so that they stay in-memory, see Database Administration.