Data Compression and Performance - Teradata Database

Teradata Database Administration

Product
Teradata Database
Release Number
15.10
Language
English (United States)
Last Update
2018-10-06
Product Category
Software

Data Compression and Performance

Compression is useful to free up additional disk space, but it may or may not enhance performance depending on the type of compression used, the workload, the frequency of data access, and the system capabilities. Compression generally affects performance as follows:

  • The cycle of decompressing and recompressing required to access and store data may significantly increase CPU usage compared to uncompressed data.
  • Storing data in dense, compressed form may reduce I/O requirements.
  • See “Compression” in Chapter 4: “Space.”

    Multivalue Compression (MVC)

    MVC compresses recurring values within a column into a single value in the table header.

    Performance Impact

    MVC enhances the performance of high data volume applications, such as call detail record and click-stream data, and provides significant performance improvement for general ad hoc workloads and full-table scan applications.

    A smaller physical row size results in fewer data blocks and fewer I/Os, and therefore improved overall performance, depending on the amount of compression achieved.

    Select and delete operations show a proportionate improvement in all cases. Inserts and updates show mixed results. The load utilities benefit from the compressed values.
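As a sketch of how MVC is declared (the table and column names here are hypothetical, not from this document), recurring values are listed in the COMPRESS phrase of a column definition:

```sql
-- Hypothetical call-detail table: the values listed in each COMPRESS
-- phrase are stored once in the table header rather than in every row.
CREATE TABLE call_detail (
    call_id     BIGINT NOT NULL,
    call_type   CHAR(10)    COMPRESS ('LOCAL', 'LONGDIST', 'INTL'),
    region      VARCHAR(20) COMPRESS ('NORTH', 'SOUTH', 'EAST', 'WEST'),
    duration_s  INTEGER     COMPRESS (0)
)
PRIMARY INDEX (call_id);
```

Rows whose column value matches a listed value carry only a small presence-bit reference to the header entry, which is what shrinks the physical row size.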

    Algorithmic Compression (ALC)

    ALC includes:

  • Teradata-supplied UDFs to compress various types of character data
  • The ability to create custom UDF algorithms
    Performance Impact

    Teradata-supplied ALC UDFs tend to reduce I/O, but can increase CPU usage.

    The performance effects of each algorithm pair are as follows:

  • CAMSET and DECAMSET (Unicode); CAMSET_L and DECAMSET_L (Latin)

    Relatively low CPU cost for small and medium column widths (higher than UTF-8, lower than ZLIB BLC), but high cost for large column widths. The decompression cost increases more quickly than UTF-8 and ZLIB as the column width grows, but for short and medium-length columns the difference is not significant.

  • LZCOMP and LZDECOMP (Unicode); LZCOMP_L and LZDECOMP_L (Latin)

    LZCOMP requires high CPU usage for compression, many times the CPU cost of TRANSUNICODETOUTF8. The decompression cost is generally low: somewhat higher than TRANSUNICODETOUTF8 for short and medium-length strings, but somewhat lower for long strings.

  • TRANSUNICODETOUTF8 and TRANSUTF8TOUNICODE

    Low CPU cost for both compression and decompression.
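To illustrate how an ALC function pair is attached to a column (the table in this example is hypothetical), a Teradata-supplied UDF pair is named in COMPRESS USING and DECOMPRESS USING phrases:

```sql
-- Hypothetical table: Unicode text is stored as UTF-8 on disk via the
-- Teradata-supplied TransUnicodeToUTF8 UDF and expanded back to Unicode
-- by TransUTF8ToUnicode whenever the column is read.
CREATE TABLE web_event (
    event_id  BIGINT NOT NULL,
    page_url  VARCHAR(1000) CHARACTER SET UNICODE
              COMPRESS USING TD_SYSFNLIB.TransUnicodeToUTF8
              DECOMPRESS USING TD_SYSFNLIB.TransUTF8ToUnicode
)
PRIMARY INDEX (event_id);
```

The decompression UDF runs on every read of the column, which is why the per-algorithm CPU costs above matter most for frequently accessed columns.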

    Block-Level Compression

    BLC compresses data at the data block level.

    Because BLC is CPU-intensive, consider the CPU capabilities of the particular Teradata Database platform when using BLC.

    When using BLC on Teradata appliance systems with very high CPU capabilities, you can:

  • Apply BLC to all allowable types of data.
  • Load and access data at any time, because there is enough available CPU to frequently decompress data.

    When using BLC on other platforms with less CPU capacity, you may need to:

  • Load tables during off-peak hours.
  • Limit access to compressed tables during critical throughput periods.

    Some operations (including queries, inserts/updates, ARC dumps/restores, reconfig, and CheckTable) can use considerably more CPU while operating on compressed tables. Unless the system is very CPU rich, these operations can impact other workloads and lengthen elapsed response times.
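As one way to defer the CPU cost of BLC on less CPU-rich platforms (the table here is hypothetical), the BLOCKCOMPRESSION table option can be set to MANUAL so the table is created uncompressed and compressed later, for example during off-peak hours, with the Ferret utility:

```sql
-- Hypothetical history table: BLOCKCOMPRESSION = MANUAL leaves the data
-- blocks uncompressed at load time; the compressed state is applied
-- later via the Ferret utility rather than during peak-hour loads.
CREATE TABLE sales_history,
    BLOCKCOMPRESSION = MANUAL (
    sale_id   BIGINT NOT NULL,
    sale_date DATE,
    amount    DECIMAL(12,2)
)
PRIMARY INDEX (sale_id);
```

By contrast, BLOCKCOMPRESSION = ALWAYS compresses blocks as data is loaded, which shifts the CPU cost onto the load itself.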