Tradeoffs Between Multivalue Compression and Storage Requirements for Compressed Values - Teradata Vantage - Analytics Database

Database Design

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ogg1628096130566.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
zqc1472244571611
lifecycle
lifecycle
Product Category
Teradata Vantageā„¢

While multivalue compression removes specified values from row storage, those values do not disappear: they must be stored somewhere. This statement applies only to values, not nulls. Null compression is handled by the presence bits in the row header and does not have an impact on the table header.

Storage of Compressed Values

The presence bits in the row header index into field 5 of the table header, where the compressed values are stored, once per column per AMP. This does not apply to algorithmically compressed data, which is stored in place within the row except for algorithmically compressed BLOB, BLOB-related UDT, CLOB, CLOB-related UDT, XML, XML-related UDT, or Geospatial data that is generally stored in subtables. See Teradata Vantageā„¢ - SQL External Routine Programming, B035-1147 for more information about algorithmic compression.

Because the size of the table header is limited to 1 MB, there is a limit to how many bytes can be compressed for a given column. If the number of bytes compressed exceeds the maximum row length, then the CREATE or ALTER TABLE statement used to create the new table is not valid and the DDL statement aborts. This is true even if the number of values specified for compression does not exceed the upper limit of 255.

The following graph plots the number of compressible values that can be specified for a column as a function of column width.



Not surprisingly, the plot clearly indicates that the wider the column, the fewer the number of values that can be compressed for the column. Particularly for wider columns, this means that in order to optimize compression, you must carefully analyze your tables to determine which values occur the most frequently and then limit compression to the top n values from that list.