15.00 - Computing and Interpreting the Break-Even Point for Multi-Value Compression - Teradata Database

Teradata Database Design

Computing and Interpreting the Break‑Even Point for Multi-Value Compression

About the Multi-Value Compression Break‑Even Point

The break-even point for multi-value compression is the point at which the net savings computed by “Equation 7: Net Capacity Usage for Multi-Value Compression With Fallback Not Enabled” or “Equation 8: Net Capacity Usage for Multi-Value Compression With Fallback Enabled” are zero. At this point, the savings in column storage are exactly balanced by the overhead for the compression bits field. In other words, compressing the specified value for a column neither adds to nor subtracts from the net storage required for the table.

The following graphic shows the break-even point for a column for various numbers of distinct values, plotting the percentage of compressible rows as a function of the column width in bytes.

The break-even point for the efficiency of multi-value compression on a column varies as a function of the size of the compressed column. The wider the column, the more storage space each compressed value saves, so a smaller percentage of rows must be compressible to offset the overhead of the compression bits.

A break-even point represents a boundary condition. If compressing a value or null for a column compresses a higher percentage of rows than the break-even point (so that net capacity usage is negative), then compression is worth doing; otherwise, it provides no storage benefit.
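
The boundary condition can be illustrated with a minimal Python sketch. It is not Equation 7 or Equation 8. It uses a simplified model in which every row carries a compression bits field of ceil(log2(n + 1)) bits for n compressed values, and each compressed row saves the full column width. Both the model and the function names are assumptions made for illustration; the sketch ignores fallback and any storage for the list of compressed values.

```python
def compress_bits(distinct_values: int) -> int:
    """Bits per row assumed for the compression bits field.

    Assumption: n compressed values plus the 'not compressed' state need
    ceil(log2(n + 1)) bits, which equals n.bit_length() for n >= 1.
    """
    if distinct_values < 1:
        raise ValueError("distinct_values must be at least 1")
    return distinct_values.bit_length()


def break_even_fraction(column_bytes: int, distinct_values: int) -> float:
    """Fraction of rows that must be compressible for zero net savings.

    Simplified model: per-row overhead (in bytes) divided by the bytes
    saved each time a row's value is compressed.
    """
    if column_bytes < 1:
        raise ValueError("column_bytes must be at least 1")
    overhead_bytes = compress_bits(distinct_values) / 8
    return overhead_bytes / column_bytes


def is_worth_compressing(fraction_compressible: float,
                         column_bytes: int,
                         distinct_values: int) -> bool:
    """True only when the compressible fraction exceeds the break-even point."""
    return fraction_compressible > break_even_fraction(column_bytes, distinct_values)


if __name__ == "__main__":
    # Break-even percentage by column width for a few value counts.
    for n in (1, 3, 15, 255):
        for width in (1, 2, 4, 8):
            pct = 100 * break_even_fraction(width, n)
            print(f"values={n:3d} width={width} bytes break-even={pct:.2f}%")
```

Under these assumptions, the break-even percentage falls as the column gets wider and rises as more distinct values are compressed. For a real table, use Equation 7 or Equation 8 rather than this simplified model.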