15.00 - Adding, Modifying, or Dropping Algorithmic Compression - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

prodname
Teradata Database
vrm_release
15.00
category
Programming Reference
featnum
B035-1184-015K

Adding, Modifying, or Dropping Algorithmic Compression

The following rules apply to using an ALTER TABLE request to add, modify, or drop algorithmic compression for a column.

  • You can use an ALTER TABLE request to add a new table column that has only algorithmic compression or that has both value and algorithmic compression.
  • You can use an ALTER TABLE request to modify the algorithmic compression characteristics of an existing column only if the table contains no data.
  • You cannot use an ALTER TABLE request to add algorithmic compression for a column that has a structured UDT data type, including a column that use embedded services UDFs for compression.
  • When you specify a UDF name for algorithmic compression, you must also specify its containing database or user.
  • You cannot specify algorithmic compression for a VALIDTIME or TRANSACTIONTIME column that has a Period data type.
  • If a table is populated with data, you cannot modify its column compression attributes if either the current or planned compression attribute specifies algorithmic compression.
  • The following table summarizes the supported compression modification cases.

     

           Current Compression

     

     

    Modified Compression

         None

    Multi-value Only

    Algorithmic Only

    Multi-value and Algorithmic

    None

           yes

                yes

                 no

                no

    Multi-value Only

           yes

                yes

                 no

                no

    Algorithmic Only

           no

                 no

                 no

                no

    Multi-value and Algorithmic

           no

                 no

                 no

                no

  • You must specify both a compression algorithm and a decompression algorithm for any column specified to have algorithmic compression.
  • There is no essential ordering to specifying both multi-value compression and algorithmic compression for a column: you can specify them in any order.

  • When you specify both multi-value compression and algorithmic compression for the same column, Teradata Database applies algorithmic compression only to values that are not specified with multi-value compression.
  • If you specify only algorithmic compression for a column, there is no limit on the size of the specified data type.
  • If you specify both algorithmic and multi-value compression for a column, the size of the data type is restricted to the same data type limitations as those for multi-value compression.
  • Teradata Database support for algorithmic compression is restricted to the following data types.
  • BLOB
  • BYTE
  • CHARACTER
  • Teradata Database implements the GRAPHIC data type as CHARACTER CHARACTER SET GRAPHIC.

  • CLOB
  • Geospatial
  • VARBYTE
  • VARCHAR
  • VARGRAPHIC
  • JSON
  • XML
  • Period
  • All distinct BLOB‑based, CLOB‑based, and XML‑based UDT types.
  • You can also compress nulls for data in the following types of columns.

  • Distinct UDT
  • ARRAY/VARRAY
  • Period types, but not derived Period types
  • You cannot specify algorithmic compression for a column that is a component of the primary index for a table.
  • You can specify algorithmic compression for a column that is a component of a secondary index for a table.
  • You cannot specify algorithmic compression for a column in a standard referential integrity relationship.
  • You can specify algorithmic compression for a column in a Batch referential integrity relationship and for a column that is a component of a Referential Constraint.
  • You can specify algorithmic compression for columns in a permanent base table and for columns in a global temporary table.
  • Teradata Database automatically compresses nulls in a column that specifies algorithmic compression.
  • The dictionary table column DBC.TVFields.CompressValueList contains the names of the algorithmic compression and algorithmic decompression UDFs specified for a column as well as the compression multi-value for the column if it specifies both algorithmic and multi-value compression.
  • If the size of CompressValueList for a table column exceeds 8,192 characters, Teradata Database aborts the ALTER TABLE request and returns an error to the requestor.

  • For the rules that apply to writing UDFs to support algorithmic compression and decompression, see SQL External Routine Programming.
  • Teradata provides some external UDFs for algorithmic compression and decompression as listed in the following table.
  •  

    UDF Name

    Function

    TransUnicodeToUTF8

    Compresses the specified Unicode character data into UTF‑8 format.

    The result data type is VARBYTE(64000).

    This is useful when the characters are in the ASCII script (U+0000 to U+007F) because UTF‑8 uses one byte to represent these characters and Unicode (UTF-16) uses two bytes.

    TransUnicodeToUTF8 can only compress characters from the ASCII script of Unicode (U+0000 to U+007F).

    TransUTF8ToUnicode

    Uncompresses data that was compressed using TransUnicodeToUTF8.

    The result data type is VARCHAR(32000) CHARACTER SET UNICODE.

    LZCOMP

    Compresses the specified Unicode character data using Lempel-Ziv coding.

    The result data type is VARBYTE(64000).

    LZDECOMP

    Uncompresses Unicode data that was compressed using LZCOMP.

    The result data type is VARCHAR(32000) CHARACTER SET UNICODE.

    LZCOMP_L

    Compresses the specified Latin character data using Lempel-Ziv coding.

    The result data type is VARBYTE(64000).

    LZDECOMP_L

    Uncompresses Latin data that was compressed using LZCOMP_L.

    CAMSET

    Compresses the specified Unicode character data into partial byte 4‑bit values for numeric characters or partial 5‑bit values for alphabetic characters using a proprietary Teradata algorithm and an embedded services function.

    The result data type is VARBYTE(64000).

    You should not use this UDF to compress CHARACTER SET GRAPHIC character data.

    DECAMSET

    Uncompresses the Unicode character data that was compressed using CAMSET.

    The result data type is VARCHAR(32000) CHARACTER SET UNICODE.

    CAMSET_L

    Compresses the specified Latin character data in the range U+0000 to U+00FF into partial byte 4‑bit values for numeric characters or partial 5‑bit values for alphabetic characters using a proprietary Teradata algorithm and an embedded services function.

    The result data type is VARBYTE(64000).

    You should not use this UDF to compress CHARACTER SET GRAPHIC character data.

    DECAMSET_L

    Uncompresses the Latin character data that was compressed using CAMSET_L.

    The result data type is VARCHAR(64000) CHARACTER SET LATIN.

    TD_LZ_COMPRESS

    Compresses the specified ALC-supported data type or predefined type data using Lempel‑Ziv coding.

    Valid column data types include distinct UDTs, distinct BLOB‑, CLOB‑, and XML‑based UDTs, ARRAY, VARRAY, Period, CHARACTER, VARCHAR, BLOB, CLOB, XML, JSON, Geospatial, BYTE, and VARBYTE.

    For a column that specifies both multi-value compression and algorithmic compression, only distinct UDTs based on predefined types that are valid for multi-value compression can be algorithmically compressed using TD_LZ_COMPRESS. This includes all numeric types, Period types, BLOB, CLOB, XML, distinct BLOB‑based UDTs, distinct CLOB‑based UDTs, distinct XML‑based UDTs, Geospatial, DATE, CHARACTER, VARCHAR, BYTE, and VARBYTE types.

    The parameter data type of TD_LZ_COMPRESS must match the data type of TD_LZ_DECOMPRESS exactly.

    The result data type must match the data type of the UDT or predefined type column exactly.

    TD_LZ_DECOMPRESS

    Uncompresses data that was compressed using TD_LZ_COMPRESS.

  • For algorithmic compression, the return data type of TD_LZ_DECOMPRESS must match the data type of the compressed column exactly.
  • For algorithmic compression, the VARBYTE return length for TD_LZ_COMPRESS must match the length of the TD_LZ_DECOMPRESS VARBYTE parameter exactly.
  • TS_COMPRESS

    Compresses TIME and TIMESTAMP data.

    TS_DECOMPRESS

    Uncompresses TIME and TIMESTAMP data that was compressed using TS_COMPRESS.

    JSON_COMPRESS

    Compresses JSON data.

    JSON_DECOMPRESS

    Uncompresses the JSON data that was compressed using JSON_COMPRESS.

    For detailed descriptions of these UDFs, see SQL Functions, Operators, Expressions, and Predicates .