15.00 - CAMSET - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product: Teradata Database
Release Number: 15.00
Content Type: Programming Reference
Publication ID: B035-1145-015K
Language: English (United States)
Last Update: 2018-09-24

CAMSET

Purpose  

Compresses the specified Unicode character data into the following possible values using a proprietary Teradata algorithm:

  • partial byte values (for example, 4-bit digits or 5-bit alphabetic letters)
  • one byte values (for example, other Latin characters)
  • two byte values (for example, other Unicode characters)

    Syntax

       [TD_SYSFNLIB.]CAMSET(Unicode_string)

    where:

    Syntax element…    Specifies…
    TD_SYSFNLIB        the name of the database where the function is located.
    Unicode_string     a Unicode character string or string expression.

    Note: This function takes no arguments when used as part of the COMPRESS USING or DECOMPRESS USING phrases. For more information about the COMPRESS/DECOMPRESS phrase, see SQL Data Types and Literals.

    ANSI Compliance

    This is a Teradata extension to the ANSI SQL:2011 standard.

    Argument Type and Rules

    Expressions passed to this function must have a data type of VARCHAR(n) CHARACTER SET UNICODE, where the maximum supported size (n) is 32000. You can also pass arguments with data types that can be converted to VARCHAR(32000) CHARACTER SET UNICODE using the implicit data type conversion rules that apply to UDFs. For example, CAMSET(CHAR) is allowed because it can be implicitly converted to CAMSET(VARCHAR).

    Note: The UDF implicit type conversion rules are more restrictive than the implicit type conversion rules normally used by Teradata Database. If an argument cannot be converted to VARCHAR following the UDF implicit conversion rules, it must be explicitly cast.

    For details, see “Compatible Types” in SQL External Routine Programming.

    The input to this function must be Unicode character data.

    If you specify NULL as input, the function returns NULL.
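    The argument rules above can be sketched with a few direct calls. This is an illustrative sketch only: the literal values are hypothetical, and the second statement assumes that the UDF implicit conversion rules do not convert a numeric value to VARCHAR, which is why it uses an explicit CAST.

```sql
-- Character literal cast explicitly to the documented argument type.
SELECT TD_SYSFNLIB.CAMSET(
         CAST('Gold pendant 18K' AS VARCHAR(32) CHARACTER SET UNICODE));

-- Assumption: INTEGER is not implicitly converted to VARCHAR under the
-- UDF conversion rules, so the value is cast explicitly before the call.
SELECT TD_SYSFNLIB.CAMSET(
         CAST(12345 AS VARCHAR(11) CHARACTER SET UNICODE));

-- NULL input: the function returns NULL, as stated above.
SELECT TD_SYSFNLIB.CAMSET(
         CAST(NULL AS VARCHAR(10) CHARACTER SET UNICODE));
```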

    Result Type

    The result data type is VARBYTE(64000).

    Usage Notes

    Uncompressed Unicode character data in Teradata Database requires 2 bytes per character. CAMSET takes Unicode character input, compresses it into partial byte, one byte, or two byte values, and returns the compressed result.

    CAMSET provides best results for short or medium Unicode strings that:

  • contain mainly digits and English alphabet letters.
  • do not frequently switch between:
      • lowercase and uppercase letters.
      • digits and letters.
      • Latin and non-Latin characters.

    For a detailed comparison between the Teradata-supplied compression functions and guidelines for choosing a compression function, see Database Administration.
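    One way to judge whether CAMSET suits a particular column is to compare the uncompressed size (2 bytes per character) with the size of the compressed result. The following sketch assumes a table like the Pendants table used in the examples in this section, with a Description column of type VARCHAR CHARACTER SET UNICODE. The column aliases are hypothetical, and actual savings depend on the data.

```sql
-- Compare the uncompressed Unicode size with the CAMSET result size per row.
SELECT ItemNo,
       CHARACTERS(Description) * 2            AS UncompressedBytes,
       BYTES(TD_SYSFNLIB.CAMSET(Description)) AS CompressedBytes
FROM Pendants;
```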

    Although you can call the function directly, CAMSET is normally used with algorithmic compression (ALC) to compress table columns. When CAMSET is used with ALC on a nullable column, nulls are also compressed.

    For more information about ALC, see “COMPRESS and DECOMPRESS Phrases” in SQL Data Types and Literals.

    Restrictions

    CAMSET can currently compress only Unicode characters in the range U+0000 to U+00FF.

    Uncompressing Data Compressed with CAMSET

    To uncompress Unicode data that was compressed using CAMSET, use the DECAMSET function. See “DECAMSET”.
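    As a rough illustration of the pairing, the following query compresses a column value and immediately uncompresses it, so the original and round-tripped values can be compared side by side. It assumes a table like the Pendants table used in the examples in this section. The RoundTrip alias is hypothetical.

```sql
-- Round trip: DECAMSET is applied to the output of CAMSET.
SELECT ItemNo,
       Description,
       TD_SYSFNLIB.DECAMSET(TD_SYSFNLIB.CAMSET(Description)) AS RoundTrip
FROM Pendants;
```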

    Example

    In this example, the Unicode values in the Description column are compressed using the CAMSET function with ALC. The DECAMSET function uncompresses the previously compressed values.

       CREATE MULTISET TABLE Pendants
          (ItemNo INTEGER,
           Gem CHAR(10) UPPERCASE CHARACTER SET UNICODE,
           Description VARCHAR(1000) CHARACTER SET UNICODE
              COMPRESS USING TD_SYSFNLIB.CAMSET
              DECOMPRESS USING TD_SYSFNLIB.DECAMSET);
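
    With ALC defined on the column, the database applies the compression and decompression functions itself, so ordinary statements do not call CAMSET or DECAMSET. A minimal sketch of using the table defined above, with hypothetical row values:

```sql
-- Description is compressed by the ALC definition when the row is stored.
INSERT INTO Pendants (ItemNo, Gem, Description)
VALUES (1, 'Ruby', 'Oval ruby pendant on 18K gold chain');

-- Description is returned as character data when the row is read.
SELECT ItemNo, Gem, Description
FROM Pendants
WHERE ItemNo = 1;
```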

    Example

    Given the following table definition:

       CREATE TABLE Pendants
          (ItemNo INTEGER,
           Description VARCHAR(100) CHARACTER SET UNICODE);

    The following query returns the compressed values of the Description column.

       SELECT TD_SYSFNLIB.CAMSET(Pendants.Description);