Compressing Column Values Using Only Multivalue Compression - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
ft:locale
en-US
ft:lastEdition
2024-12-11
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

Teradata systems use a variety of compression methods. Multivalue and row compression are lossless. The original data can be reconstructed exactly from their compressed forms. Algorithmic compression can be a non-loss or lossy process.

Multivalue compression refers to the storage of column values one time only in the table header, not in the row itself, and pointing to the values using an array of presence bits in the row header. This method of value compression is called Dictionary Indexing. See Multivalue Compression .

You can specify multivalue compression only for columns having one of the following data types.
  • All numeric types
  • DATE
  • CHARACTER and CHARACTER SET GRAPHIC
  • VARCHAR and CHARACTER SET VARGRAPHIC
  • BYTE
  • VARBYTE
Because the list of compressed data values is memory-resident any time the table is being accessed, the system can access compressed values in 2 ways, depending on the presence bits for the column.
  • Using a pointer reference to the value in the current row
  • Using a pointer reference to the value in the multivalue

There is a small cost for compressing a value when a row is stored for the first time, and a one-time cost to convert an existing noncompressed column to a compressed column. But for queries, even those made against small tables, multivalue compression is a net win if the chosen compression reduces the size of the table. See Multivalue Compression Method .

For multivalue-compressed spools, if a column with no expressions applied is copied to spool, Vantage copies only the compressed data into the spool, saving CPU and I/O costs. In the spool, compression works as in a base table. A compressed multivalue list in the table header of the spool is memory-resident while the system operates on the spool.

You can compress nulls and up to 255 distinct values per column using standard default multivalue compression. You can compress the values for an unlimited number of columns per table. Although there is no defined limit on the number of columns that can be compressed per table, the number of columns that can be compressed is a function of the amount of space available in the table header that is not consumed by other optional table characteristics.

Type of Multivalue Compression Description
Single-valued The default is null if no value is specified explicitly.
Multivalued You must specify all values explicitly. There is no default.

There is no essential ordering for specifying values to be compressed for a column.

You cannot specify the same value or NULL more than once in the compression multivalue for a column.

Multivalue compression, once defined for table columns, is retained as a property of spools containing data from the table too, so compression is not only a property of data stored on disk.

Multivalue Compression Capacity

Using multivalue compression to compress a substantial number of values in a table can cause the following capacity issues.

Table Header Overflow

Compressed multivalue information is stored in the table header, and like other database rows, table header rows have a maximum length of 64 KB. Null storage is indicated by presence bits in the row header and has no impact on the size of the table header. The maximum size of a table header is 1 MB. If table header overflow occurs, you must resubmit the CREATE TABLE statement fewer compressed column values.

For additional information about modifying multivalue compressed columns, see the "Adding or Modifying Multivalue Compression for a Column" section in Using Compression Methods with Table Columns.

Multiple factors contribute to the length of table header rows, including partitioning columns, so there are no rigid guidelines to avoid overflow. See Table Headers .

Dictionary Cache Overflow

The system reports a 3930 error. Increase the size of your dictionary cache to the maximum size of 1 MB. The default value for the size of the dictionary cache is also 1 MB, so you only need to increase its size if its size was reduced.