Compressing Column Values Using Only Algorithmic Compression - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
imq1591724555718.ditamap
dita:ditavalPath
imq1591724555718.ditaval
dita:id
B035-1184
lifecycle
previous
Product Category
Teradata Vantage™
The term compression is used to mean different things for Teradata systems. Algorithmic compression can be done using either a non-loss method or a lossy method.
  • When describing algorithmic compression of column values, compression refers to whatever methods you choose to implement to compress byte, character, and graphic data. Algorithmic compression is implemented by assigning a compression and a decompression UDF to a table column. Although you can only specify one compression UDF and one decompression UDF per column, those functions can compress and decompress multiple column values. For the syntax used to assign algorithmic compression UDFs to columns, see Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144. For details about coding UDFs to implement algorithmic compression, see Teradata Vantage™ - SQL External Routine Programming, B035-1147.
  • When you specify algorithmic compression for a column, you might be using a form of lossless compression, or you might be using lossy compression for sound, video, graphic, or picture files.
  • When you specify a UDF name for algorithmic compression, you must also specify its containing database or user.
  • You can specify algorithmic compression only for columns having one of the following data types.
    • BYTE
    • CHARACTER
    • VARBYTE
    • VARCHAR
    • All UDT types that are not based on BLOBs or CLOBs.

    You can specify algorithmic compression for Period types.

  • Algorithmic compression is not supported for the columns of join indexes
  • Algorithmic compression automatically compresses nulls.
  • Algorithmic compression is carried to spools whenever possible.
  • You can specify algorithmic compression for an unlimited number of columns in a table.
  • The column DBC.TVFields.CompressValueList stores the names of the compression and decompression algorithms specified for a column as well as the compressed multivalue for any multivalue compression specified for a column.

    The maximum size of DBC.TVFields.CompressValueList is 8,192 characters. If the composite size of the value and algorithmic compression information exceeds this limit, the system reports an error.

The following rules apply to the compression and decompression algorithms you create as UDFs.
  • The UDFs must be created in either the SYSUDTLIB or the TD_SYSFNLIB database.
  • The UDFs must be scalar UDFs.

    Neither aggregate nor table UDFs are valid for algorithmic compression.

  • The UDFs must be deterministic.
  • The UDFs can only have 1 input parameter, and the data type of the value passed must be a type that is valid for algorithmic compression.
  • The input to an DECOMPRESS USING UDF must have a variable data type.

    The output from an DECOMPRESS USING UDF must be compatible with the data type of the column being decompressed.

  • Algorithmic compression cannot truncate data values.

    The length of the column must be less than or equal to the length of the input parameter passed to the COMPRESS USING UDF.

    Similarly, the length of the column must be less than or equal to the length of the output parameter returned by the DECOMPRESS USING UDF.

  • A user who references a column defined with algorithmic compression must have the privilege on the COMPRESS USING and DECOMPRESS USING UDFs specified for that column.
  • Vantage supports function overloading for the COMPRESS USING and DECOMPRESS USING UDFs you write.

    For example, you can overload an algorithm that handles different data types, and the system executes the appropriate function depending on the data type of the input parameter value passed to it.

  • Vantage supports algorithmic compression of Period columns.

The costs of compressing and decompressing column data algorithmically depends on the algorithms used, and you must evaluate those costs yourself.

Teradata provides the following system-defined external UDFs that you can use for algorithmic compression and decompression. See Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210. These functions are stored in the TD_SYSFNLIB database.