Block-Level Compression - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-22
dita:mapPath
qby1588121512748.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantageā„¢

This section summarizes the benefits, best uses, and restrictions of block-level compression in Teradata Database.

Benefits of Block-Level Compression

  • Can apply to data blocks that contain nearly any type of data
  • Requires no analysis of column data prior to implementation
  • Easily combined with multivalue compression
  • More efficient than algorithmic compression for tables with numerous narrow-width columns
  • Has no effect on table header size

Best Use of Block-Level Compression

  • Very large tables
  • Tables that do not require frequent decompression, especially tables that contain rarely updated COLD or static data
  • Tables accessed with low concurrency
  • Systems with low CPU utilization levels
  • Systems where you can use workload management to restrict the overhead of data decompression to a minimal level

Restrictions and Limitations of Block-Level Compression

  • Strictly used to reduce storage space
  • Does not have any performance benefits
  • Can be very costly due to the following factors:
    • Initial cost of applying the compression
    • Ongoing cost of decompressing data blocks when they are accessed
    • Ongoing cost of recompressing data blocks after they have been modified
  • Should not apply for tables that contain WARM or HOT data if they have a CPU utilization rate greater than 80%
  • Extent of compression that can be achieved depends on the characteristics of the data contained within a data block as well as the maximum size that has been specified for permanent data blocks

    Individual data blocks within a table might have different size reduction rates, so data blocks frequently show more size variance after being block-level compressed than they did in their original state

  • Best not used in combination with algorithmic compression because of the combined cost of their decompression
  • Cannot use for data dictionary tables, WAL data blocks, cylinder indexes, table headers, or other internal file structures
  • If either multivalue compression or algorithmic compression is defined on a table, the extent of storage reduction achieved by adding block-level compression is less than if no compression had been present at the time block-level compression was added
  • Achieves the best compression ratio for data blocks defined with the maximum 1 MB size

    Larger maximum data block sizes at the time of compression allow cylinders to be more fully utilized, which can also impact the degree of compressibility.

    As a table matures, it experiences updates and data block splits, which reduce the size of some data blocks.

    During the compression process, Teradata Database reforms data blocks into their maximum size.

    With larger block sizes defined, Teradata Database can achieve a greater degree of block-level compression.

    For performance purposes, the size to make data blocks depends on the application (as it does without BLC). Random access (for example, prime key access and USI access paths) does better with more moderate data block sizes, but sequential workloads (full table scans) work better with larger data blocks.
  • A large number of significantly smaller data blocks can fill the cylinder index, leaving a cylinder with space that cannot be used because some data blocks are considered too small to provide meaningful compression, and Teradata Database bypasses those blocks during the compression process
  • Does not compress secondary index subtables, so if a table has secondary indexes before compression, the size reduction rate after compression appears to be smaller than if the table had no secondary indexes at the time of compression
  • Should not use for small or medium-sized tables that are frequently accessed or updated.