Data Compression and Performance
Compression frees up disk space, but it may or may not improve performance, depending on the type of compression used, the workload, the frequency of data access, and the system capabilities. Compression generally affects performance as follows:
See “Compression” in Chapter 4: “Space.”
Multivalue Compression (MVC)
MVC compresses recurring values within a column into a single value in the table header.
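The idea can be illustrated outside the database with a toy model. The sketch below is not Teradata's implementation; the function names, the row representation, and the `max_values` limit are hypothetical and chosen only to show how recurring values move into a header while rows keep a small reference.

```python
from collections import Counter

def mvc_compress(column, max_values=255):
    """Toy multivalue compression: store recurring values once in a
    header and replace each occurrence with a small header index.
    max_values is an illustrative cap, not a documented limit."""
    counts = Counter(column)
    # Only values that occur more than once are worth moving to the header.
    header = [v for v, n in counts.most_common(max_values) if n > 1]
    index_of = {v: i for i, v in enumerate(header)}
    # Each row holds either ("ref", header index) or ("raw", literal value).
    rows = [("ref", index_of[v]) if v in index_of else ("raw", v)
            for v in column]
    return header, rows

def mvc_decompress(header, rows):
    """Rebuild the original column from the header and the row entries."""
    return [header[x] if kind == "ref" else x for kind, x in rows]

column = ["LOCAL", "LOCAL", "INTL", "LOCAL", "ROAMING", "INTL"]
header, rows = mvc_compress(column)
assert mvc_decompress(header, rows) == column
print(header)  # the recurring values, stored once
print(rows)    # per-row references or literals
```

In this model, "ROAMING" occurs only once, so it stays in the row as a literal; only the recurring values are moved to the header.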
Performance Impact
MVC enhances the performance of high data volume applications, such as call record detail and click-stream data, and provides significant performance improvement for general ad hoc workloads and full-table scan applications.
A smaller physical row size results in fewer data blocks and fewer I/Os, which improves overall performance in proportion to the amount of compression achieved.
Select and delete operations show a proportionate improvement in all cases. Inserts and updates show mixed results. The load utilities benefit from the compressed values.
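The relationship between row size, data block count, and I/O can be sketched with simple arithmetic. The figures below (row count, row sizes, block size) are made-up assumptions, and the model assumes one I/O per data block during a full-table scan and ignores block and row overhead.

```python
import math

def blocks_needed(row_count, row_bytes, block_bytes):
    """Number of data blocks needed when rows are packed whole into blocks."""
    rows_per_block = block_bytes // row_bytes
    return math.ceil(row_count / rows_per_block)

ROWS = 1_000_000
BLOCK_BYTES = 64_000          # hypothetical data block size

before = blocks_needed(ROWS, 200, BLOCK_BYTES)   # uncompressed row size (assumed)
after = blocks_needed(ROWS, 140, BLOCK_BYTES)    # row size after MVC (assumed)

print(f"blocks before: {before}, after: {after}")
print(f"full-table scan reads {1 - after / before:.0%} fewer blocks")
```

The saving scales with the compression achieved: a column that compresses poorly changes the row size, and therefore the block count, very little.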
Algorithmic Compression (ALC)
ALC includes:
Performance Impact
Teradata-standard ALC UDFs tend to reduce I/O, but can increase CPU usage.
Algorithm | Performance Effects |
 | Relatively low CPU cost for small and medium column widths (higher than UTF-8, lower than ZLIB BLC), but high cost for large column widths. The decompression cost increases more quickly than UTF-8 and ZLIB as the column width grows, but for short and medium length columns, the difference is not very significant. |
 | LZCOMP requires high CPU usage for compression, many times the CPU cost of TRANSUNICODETOUTF8. The decompression cost is generally low; somewhat higher than TRANSUNICODETOUTF8 for short and medium length strings, but somewhat lower than TRANSUNICODETOUTF8 for long strings. |
 | Low CPU cost for both compression and decompression. |
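The trade-off between bytes saved and CPU spent can be observed with a small experiment. This sketch does not call the Teradata UDFs: it assumes Unicode data is held as UTF-16, uses Python's built-in encoders to mimic a Unicode-to-UTF-8 transformation, and uses `zlib` (DEFLATE) as a stand-in for a Lempel-Ziv style algorithm. The sample value is invented, and the timings only indicate relative cost on the machine that runs them.

```python
import timeit
import zlib

# ASCII-only sample value, repeated to make a long string (assumed data).
text = "customer_status=ACTIVE;region=EMEA;" * 20

utf16 = text.encode("utf-16-le")   # 2 bytes per character for this sample
utf8 = text.encode("utf-8")        # 1 byte per ASCII character
deflated = zlib.compress(utf16)    # dictionary-based compression stand-in

print("bytes:", len(utf16), "UTF-16 ->", len(utf8), "UTF-8 ->",
      len(deflated), "DEFLATE")

# Both transformations are lossless.
assert utf8.decode("utf-8") == text
assert zlib.decompress(deflated) == utf16

# Rough CPU comparison; absolute numbers vary by machine.
t_transcode = timeit.timeit(lambda: text.encode("utf-8"), number=2000)
t_deflate = timeit.timeit(lambda: zlib.compress(utf16), number=2000)
print(f"transcode: {t_transcode:.4f}s, deflate: {t_deflate:.4f}s")
```

For ASCII-heavy data the transcoding step halves the storage at little cost, while the dictionary-based step saves more bytes on repetitive values but does more work per value.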
Block-Level Compression
BLC compresses data at the data block level.
Because BLC is CPU-intensive, consider the CPU capabilities of the particular Teradata Database platform when using BLC.
When using BLC on the Teradata appliance systems with very high CPU capabilities, you can:
When using BLC on other platforms with less CPU capacity, you may need to:
Some operations (including queries, inserts and updates, ARC dumps and restores, reconfig, and CheckTable) can use considerably more CPU while operating on compressed tables. Unless the system is very CPU rich, these operations can impact other workloads and lengthen elapsed response times.
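A toy model helps show why operations on block-compressed tables cost extra CPU. The sketch below is an illustration only, not how Teradata stores data: the helper names, the newline-delimited block layout, and the use of `zlib` are all assumptions. It shows that reading one row means decompressing its whole block, and updating one row means decompressing and recompressing that block.

```python
import zlib

def build_blocks(rows, rows_per_block):
    """Pack rows into blocks and compress each block as a unit."""
    blocks = []
    for i in range(0, len(rows), rows_per_block):
        payload = "\n".join(rows[i:i + rows_per_block]).encode("utf-8")
        blocks.append(zlib.compress(payload))
    return blocks

def read_row(blocks, rows_per_block, n):
    """Reading a single row still decompresses the entire block."""
    block = zlib.decompress(blocks[n // rows_per_block])
    return block.decode("utf-8").split("\n")[n % rows_per_block]

def update_row(blocks, rows_per_block, n, new_value):
    """Updating a single row decompresses and recompresses its block."""
    b = n // rows_per_block
    rows = zlib.decompress(blocks[b]).decode("utf-8").split("\n")
    rows[n % rows_per_block] = new_value
    blocks[b] = zlib.compress("\n".join(rows).encode("utf-8"))

# Invented sample rows; none of them contains a newline.
rows = [f"{i},customer_{i % 7},ACTIVE" for i in range(1000)]
blocks = build_blocks(rows, rows_per_block=100)

update_row(blocks, 100, 250, "250,customer_5,CLOSED")
print(read_row(blocks, 100, 250))
print(len(blocks), "compressed blocks")
```

Every access pays the per-block decompression cost regardless of how many rows it needs, which is why the same workload consumes more CPU on compressed tables.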