Compressing Column Values Using Only Algorithmic Compression - Analytics Database - Teradata Vantage

SQL Data Definition Language Detailed Topics

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
dita:mapPath
vuk1628111288877.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
jbg1472252759029
lifecycle
latest
Product Category
Teradata Vantage™
The term compression is used to mean different things for Teradata systems. Algorithmic compression can be done using either a non-loss method or a lossy method.
  • When describing algorithmic compression of column values, compression refers to whatever methods you choose to implement to compress byte, character, and graphic data. Algorithmic compression is implemented by assigning a compression and a decompression UDF to a table column. Although you can only specify one compression UDF and one decompression UDF per column, those functions can compress and decompress multiple column values. For the syntax used to assign algorithmic compression UDFs to columns, see Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144. For details about coding UDFs to implement algorithmic compression, see Teradata Vantage™ - SQL External Routine Programming, B035-1147.
  • When you specify algorithmic compression for a column, you might be using a form of lossless compression, or you might be using lossy compression for sound, video, graphic, or picture files.
  • When you specify a UDF name for algorithmic compression, you must also specify its containing database or user.
  • You can specify algorithmic compression only for columns having one of the following data types.
    • BYTE
    • CHARACTER
    • VARBYTE
    • VARCHAR
    • All UDT types that are not based on BLOBs or CLOBs.

    You can specify algorithmic compression for Period types.

  • Algorithmic compression is not supported for the columns of join indexes
  • Algorithmic compression automatically compresses nulls.
  • Algorithmic compression is carried to spools whenever possible.
  • You can specify algorithmic compression for an unlimited number of columns in a table.
  • The column DBC.TVFields.CompressValueList stores the names of the compression and decompression algorithms specified for a column as well as the compressed multivalue for any multivalue compression specified for a column.

    The maximum size of DBC.TVFields.CompressValueList is 8,192 characters. If the composite size of the value and algorithmic compression information exceeds this limit, the system reports an error.

The following rules apply to the compression and decompression algorithms you create as UDFs.
  • The UDFs must be created in either the SYSUDTLIB or the TD_SYSFNLIB database.
  • The UDFs must be scalar UDFs.

    Neither aggregate nor table UDFs are valid for algorithmic compression.

  • The UDFs must be deterministic.
  • The UDFs can only have 1 input parameter, and the data type of the value passed must be a type that is valid for algorithmic compression.
  • The input to an DECOMPRESS USING UDF must have a variable data type.

    The output from an DECOMPRESS USING UDF must be compatible with the data type of the column being decompressed.

  • Algorithmic compression cannot truncate data values.

    The length of the column must be less than or equal to the length of the input parameter passed to the COMPRESS USING UDF.

    Similarly, the length of the column must be less than or equal to the length of the output parameter returned by the DECOMPRESS USING UDF.

  • A user who references a column defined with algorithmic compression must have the privilege on the COMPRESS USING and DECOMPRESS USING UDFs specified for that column.
  • Vantage supports function overloading for the COMPRESS USING and DECOMPRESS USING UDFs you write.

    For example, you can overload an algorithm that handles different data types, and the system executes the appropriate function depending on the data type of the input parameter value passed to it.

  • Vantage supports algorithmic compression of Period columns.

The costs of compressing and decompressing column data algorithmically depends on the algorithms used, and you must evaluate those costs yourself.

Teradata provides the following system-defined external UDFs that you can use for algorithmic compression and decompression. See Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210. These functions are stored in the TD_SYSFNLIB database.