Algorithmic Compression | Database Design | Teradata Vantage - Algorithmic Compression - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
kko1591750222108.ditamap
dita:ditavalPath
kko1591750222108.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

Vantage software includes several standard compression algorithms, in the form of UDFs, which you can use to compress many types of data by table column. You can also create custom compression and decompression algorithms in UDF format.

When column values are mostly unique, algorithmic compression (ALC) may provide better compression results than MVC. When columns have repeated values, you can use ALC and MVC concurrently on the same column, but the system does not apply ALC to any value covered by MVC.

ALC generally functions best on cold (seldom used) data because of the amount of CPU required for decompress/recompress when compressed data is accessed, although some ALC algorithms require less CPU than others. ALC is considered to be the most difficult to implement of all the compression methods.

You can use algorithmic compression to compress table columns with the following data types:
  • ARRAY
  • BYTE
  • VARBYTE
  • BLOB
  • CHARACTER
  • VARCHAR
  • CLOB
  • JSON, with some restrictions listed below
  • DATASET, with some restrictions listed below
  • TIME and TIME WITH TIME ZONE
  • TIMESTAMP and TIMESTAMP WITH TIME ZONE
  • Period types
  • Distinct UDTs, with some restrictions listed below
  • System-defined UDTs, with some restrictions listed below

Procedure

  1. Specify the name of an ALC compression UDF in the COMPRESS USING phrase for a column definition in a CREATE TABLE or ALTER TABLE statement, for example:
    CREATE TABLE  table_name 
      (ItemNo INTEGER,
       Gem CHAR(10) UPPERCASE,
       Description VARCHAR(1000)
          COMPRESS USING TD_SYSFNLIB.compression_UDF_name 
          DECOMPRESS USING TD_SYSFNLIB.decompression_UDF_name);
  2. Specify the name of an ALC decompression UDF in the DECOMPRESS USING phrase for the column definition.

Usage Notes

  • Specify only one set of compression/decompression UDFs for a particular column.
  • The system automatically compresses nulls when you specify COMPRESS.
Teradata standard compression UDFs are located in the Teradata system function library, TD_SYSFNLIB.

For information about functional details, usage requirements, and restrictions on compression and decompression functions, see Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210.

Restrictions

These restrictions apply:
  • You cannot use ALC to compress columns that have a data type of structured UDT.
  • The TD_LZ_COMPRESS and TD_LZ_DECOMPRESS system functions compress all large UDTs including UDT-based system types such as Geospatial, XML, and JSON. However, if you write your own compression functions, the following restrictions apply:
    • Custom compression functions cannot be used to compress UDT-based system types (except for ARRAY and Period types).
    • Custom compression functions cannot be used to compress distinct UDTs that are based on UDT-based system types (except for ARRAY and Period types).
  • You cannot write your own compression functions to perform algorithmic compression on JSON type columns. However, Teradata provides the JSON_COMPRESS and JSON_DECOMPRESS functions that you can use to perform ALC on JSON type columns.
  • You cannot write your own compression functions to perform algorithmic compression on DATASET type columns. However, Teradata provides the SNAPPY_COMPRESS and SNAPPY_DECOMPRESS functions that you can use to perform ALC on DATASET type columns.
  • You cannot use ALC to compress temporal columns:
    • A column defined as SYSTEM_TIME, VALIDTIME, or TRANSACTIONTIME.
    • The DateTime columns that define the beginning and ending bounds of a temporal derived period column (SYSTEM_TIME, VALIDTIME, or TRANSACTIONTIME).

    You can use ALC to compress Period data types in columns that are nontemporal; however, you cannot use ALC to compress derived period columns.

    For more information about temporal tables, see Teradata Vantage™ - Temporal Table Support, B035-1182 and Teradata Vantage™ - ANSI Temporal Table Support, B035-1186.

  • You cannot specify multivalue or algorithmic compression for a row-level security constraint column.

You can apply algorithmic compression to referential integrity columns.

Depending on the implementation, algorithmic compression can be either physical or logical, though most implementations use physical data compression. Algorithmic compression can be either lossy or lossless, depending on the algorithm used.

Using Custom ALC Algorithms

You can implement a custom compression algorithm as a scalar external UDF, and then specify the UDF in the COMPRESS USING phrase of a column definition for a CREATE TABLE or ALTER TABLE statement, for example:

CREATE TABLE  table_name 
  (ItemNo INTEGER,
   Type CHAR(10) UPPERCASE,
   Description VARCHAR(1000)
      COMPRESS USING  compression_UDF_name 
        DECOMPRESS USING  decompression_UDF_name);
If you create custom ALC UDFs, test them thoroughly. If the compression or decompression algorithm fails, compressed data may be corrupted or may not be recoverable.

Related Information

For information on... See...
specifying a UDF in the COMPRESS USING phrase of a CREATE TABLE or ALTER TABLE statement
  • Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144
  • Teradata Vantage™ - SQL Data Definition Language Detailed Topics, B035-1184
rules for using compression/decompression algorithms in UDFs Teradata Vantage™ - SQL External Routine Programming, B035-1147
Topic Reference
System-defined external UDFs for algorithmic compression and decompression Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210
How to code scalar UDFs to perform algorithmic compression on column data Teradata Vantage™ - SQL External Routine Programming, B035-1147
How to create the SQL definition for an algorithmic compression UDF CREATE FUNCTION (External Form) information in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144
How to specify those scalar UDFs in a table definition ALTER TABLE and CREATE TABLE information in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144
Guidelines for selecting an algorithmic compression functions https://downloads.teradata.com/extensibility/articles/selecting-an-alc-compression-algorithm
Evaluating algorithmic compression UDFs Download the test suite from https://downloads.teradata.com/download/extensibility/algorithmic-compression-test-package