Number of Presence Bits Required to Represent Compressed Values - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

Because compression is signaled by populating octets of presence bit fields in the row header, there is a tradeoff between the capacity required to add presence octets and the savings realized from multivalue compression at the base table row level. The size of the bit field required to express multivalue compression increases nonlinearly as a function of the number of values to compress. Typically, multivalue compression is worth doing.

The following table shows the relationship between the number of values specified for compression in each column and the number of bits required to indicate that number of values. Null is one of the values compressed when multivalue compression is specified for a nullable column, even if null is not explicitly listed in the COMPRESS clause.

Nullable? Number of Column Values Compressed Number of Compress Bits in the Row Header Bit Field
Yes null 1

Shared with the presence bit.

No 1 1
No 2 - 3 2
No 4 - 7 3
No 8 - 15 4
No 16 - 31 5
No 32 - 63 6
No 64 - 127 7
No 128 - 255 8
Yes 1 (including null) 1
Yes 2 - 3 (including null) 2
Yes 4 - 7 (including null) 3
Yes 8 - 15 (including null) 4
Yes 16 - 31 (including null) 5
Yes 32 - 63 (including null) 6
Yes 64 - 127 (including null) 7
Yes 128 - 255 (including null) 8

The compress bits value represents an index into the compress multivalue array for that column in the table header. The following graphic shows that the presence bit pattern 10, for example, references the compressed character string 'Los Angeles' in the table header. Because only 3 values are compressed for this column, 2 compression bits are required to uniquely identify them.

The table definition DDL for this illustration is as follows:

     CREATE TABLE MVCompress (
       StreetAddress VARCHAR(40),
       City          CHARACTER(20) COMPRESS ('New York',
                     'Los Angeles', 'Chicago') NOT NULL,
       StateCode     CHARACTER(2));


All the possible combinations of bit patterns are used to identify and locate the values. For this example, the bit patterns and their corresponding values are those indicated in the following table:

Presence Bit Pattern City Value
00 Not compressed, stored in row.
01 'Chicago' , compressed, stored in table header
10 'Los Angeles', compressed, stored in table header
11 'New York', compressed, stored in table header

From the perspective of the row header, the best policy is to compress the highest number of values in each octet bit array, because doing so does not add any more space in the row header. That is, the best practice is to optimize multivalued compression for 1, 3, 7, 15, 31, 63, 127, or 255 values. Evaluate these possibilities to determine which yields the best compression for a column.

You must also take into account the number of bytes compression transfers to the table header, because those bytes count against the maximum row length of 1 MB.

Systems using small cylinders have a maximum row length of 64 KB.

The compressed values are stored adjacent to each other in ascending value order and descending order of field size following the end of Field5 in the table header.