Number of Presence Bits Required to Represent Compressed Values
Because compression is signaled by populating octets of presence bit fields in the row header, there is a tradeoff between the capacity required to add presence octets and the savings realized from multivalue compression at the base table row level: the size of the bit field required to express multivalue compression increases non-linearly as a function of the number of values to be compressed. With respect to the savings gained by multivalue compression, this tradeoff is almost always worth doing.
The following table indicates the relationship between the number of values (and null, which is always compressed by default) specified for compression per column and the number of compress bits required to indicate that number of values.
Nullable? |
Number of Column Values Compressed |
Number of Compress Bits in the Row Header Bit Field |
Yes |
null |
1 Shared with the presence bit. |
No |
1 |
1 |
No |
2 - 3 |
2 |
No |
4 - 7 |
3 |
No |
8 - 15 |
4 |
No |
16 - 31 |
5 |
No |
32 - 63 |
6 |
No |
64 - 127 |
7 |
No |
128 - 255 |
8 |
Yes |
1 + null |
1 |
Yes |
2 - 3 + null |
2 |
Yes |
4 - 7 + null |
3 |
Yes |
8 - 15 + null |
4 |
Yes |
16 - 31 + null |
5 |
Yes |
32 - 63 + null |
6 |
Yes |
64 - 127 + null |
7 |
Yes |
128 - 255 + null |
8 |
The compress bits value represents an index into the compress multivalue array for that column in the table header. This is illustrated by the following graphic showing that the presence bit pattern 10 points to the compressed character string “Los Angeles” in the table header. Note that because only 3 values are compressed for this column, 2 compression bits are required to uniquely identify them.
The table definition DDL for this illustration is as follows:
CREATE TABLE MVCompress (
StreetAddress VARCHAR(40),
City CHARACTER(20) COMPRESS (’New York’,
’Los Angeles’, ’Chicago’) NOT NULL,
StateCode CHARACTER(2));
As you can see, all the possible combinations of bit patterns are used to identify and locate the values. For this particular example, the bit patterns and their corresponding values are those indicated in the following table:
FOR this pattern of presence bits … |
The value for City is … |
AND stored in … |
00 |
not compressed. |
the row. |
01 |
’Chicago’ and is compressed |
the table header. |
10 |
’Los Angeles’ and is compressed |
the table header. |
11 |
’New York’ and is compressed |
the table header. |
From the perspective of the row header, the best policy is to compress the highest number of values in each octet bit array, because doing so does not add any more space in the row header (nor does null compression, which is handled by presence bits in the row header for each row). In other words, the best practice is to optimize multivalued compression for 1, 3, 7, 15, 31, 63, 127, or 255 values. You should evaluate all these possibilities to determine which yields the best compression for a particular column.
You must also take into account the number of bytes compression transfers to the table header, because those bytes count against the maximum row length of 64 KB.
The compressed values are stored adjacent to one another in ascending value order and descending order of field size following the end of Field5 in the table header. See “Table Header Components” on page 711.