This section summarizes the benefits, best uses, and restrictions of multivalue compression in Teradata Database.
Benefits of Multivalue Compression
- Minimal performance decrement for compressing or decompressing values
- Reduced I/O for table scans, spooling result sets, and data loads
- Reduced I/O for cached tables because more smaller tables can be cached, eliminating I/O operations on those tables
- Valid for most commonly used data types
- Spools retain the multivalue compression of their base tables
- Can be specified for permanent, global temporary, and volatile table columns
- Automatically compresses nulls
- Easily and complementarily combined with algorithmic compression
- Easily combined with block-level and temperature-based block-level compression and autocompression for column-partitioned tables and join indexes
Best Use of Multivalue Compression
- Very wide columns
- Columns with few distinct values
- Columns in large tables
- Tables having a large number of rows with repeating values
- Tables having a large number of rows with default values
- Columns having a constant distribution of values across time
Restrictions and Limitations of Multivalue Compression
- Requires extensive data analysis prior to implementation
- Can compress a maximum of only 255 different characters per column
- The more values compressed per row, the more presence bits in the row header are required, which increases the row size
- Affects table header size
- Cannot specify for primary index or primary AMP index columns
- Cannot for the partitioning columns of a partitioning expression
- Can specify on secondary index columns, but the column data in the secondary index subtable is not multivalue compressed
- Columns with many unique or nearly unique values cannot benefit from multivalue compression
- A given multivalue compression scheme can become less useful across time if the values in the table change significantly