Because compression is signaled by populating octets of presence bit fields in the row header, there is a tradeoff between the capacity required to add presence octets and the savings realized from multivalue compression at the base table row level. The size of the bit field required to express multivalue compression increases nonlinearly as a function of the number of values to compress. Typically, multivalue compression is worth doing.
The following table shows the relationship between the number of values specified for compression in each column and the number of bits required to indicate that number of values. Null is one of the values compressed when multivalue compression is specified for a nullable column, even if null is not explicitly listed in the COMPRESS clause.
| Nullable? | Number of Column Values Compressed | Number of Compress Bits in the Row Header Bit Field |
|---|---|---|
| Yes | null | 1 Shared with the presence bit. |
| No | 1 | 1 |
| No | 2 - 3 | 2 |
| No | 4 - 7 | 3 |
| No | 8 - 15 | 4 |
| No | 16 - 31 | 5 |
| No | 32 - 63 | 6 |
| No | 64 - 127 | 7 |
| No | 128 - 255 | 8 |
| Yes | 1 (including null) | 1 |
| Yes | 2 - 3 (including null) | 2 |
| Yes | 4 - 7 (including null) | 3 |
| Yes | 8 - 15 (including null) | 4 |
| Yes | 16 - 31 (including null) | 5 |
| Yes | 32 - 63 (including null) | 6 |
| Yes | 64 - 127 (including null) | 7 |
| Yes | 128 - 255 (including null) | 8 |
The compress bits value represents an index into the compress multivalue array for that column in the table header. The following graphic shows that the presence bit pattern 10, for example, references the compressed character string 'Los Angeles' in the table header. Because only 3 values are compressed for this column, 2 compression bits are required to uniquely identify them.
The table definition DDL for this illustration is as follows:
CREATE TABLE MVCompress (
StreetAddress VARCHAR(40),
City CHARACTER(20) COMPRESS ('New York',
'Los Angeles', 'Chicago') NOT NULL,
StateCode CHARACTER(2));
All the possible combinations of bit patterns are used to identify and locate the values. For this example, the bit patterns and their corresponding values are those indicated in the following table:
| Presence Bit Pattern | City Value |
|---|---|
| 00 | Not compressed, stored in row. |
| 01 | 'Chicago' , compressed, stored in table header |
| 10 | 'Los Angeles', compressed, stored in table header |
| 11 | 'New York', compressed, stored in table header |
From the perspective of the row header, the best policy is to compress the highest number of values in each octet bit array, because doing so does not add any more space in the row header. That is, the best practice is to optimize multivalued compression for 1, 3, 7, 15, 31, 63, 127, or 255 values. Evaluate these possibilities to determine which yields the best compression for a column.
You must also take into account the number of bytes compression transfers to the table header, because those bytes count against the maximum row length of 1 MB.
The compressed values are stored adjacent to each other in ascending value order and descending order of field size following the end of Field5 in the table header.