15.10 - CHARACTER Data Type - Teradata Database

Teradata Database SQL Data Types and Literals

prodname: Teradata Database
vrm_release: 15.10
category: Programming Reference
featnum: B035-1143-151K

Represents a fixed-length character string for Teradata Database internal character storage.

where:

 

Syntax element …

Specifies …

n

the number of characters or bytes allotted to the column defined with this server character set:

  • For the LATIN server character set, the maximum value for n is 64000 characters.
  • For the UNICODE and GRAPHIC server character sets, the maximum value for n is 32000 characters.
  • For the KANJISJIS server character set, the maximum value for n is 32000 bytes.
  • If a value for n is not specified, the default is 1.

    server_character_set

    the server character set for the character column being defined. See “CHARACTER SET Phrase” on page 226.

    If the CHARACTER SET server_character_set clause is omitted, the default server character set depends on how the user is defined in the DEFAULT CHARACTER SET clause of the CREATE USER statement. See “CREATE USER” in SQL Data Definition Language.

     

    Notice:

    KANJI1 support is deprecated. KANJI1 is not allowed as a default character set. The system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible.

    Supported values for server_character_set are as follows:

  • LATIN represents fixed 8-bit characters from the ASCII ISO 8859 Latin1 or ISO 8859 Latin9 repertoires. See “LATIN Server Character Set” on page 228.
  • UNICODE represents fixed 16-bit characters from the UNICODE 6.0 standard. See “UNICODE Server Character Set” on page 229.
  • GRAPHIC represents fixed 16-bit UNICODE characters defined by IBM Corporation for DB2. See “GRAPHIC Server Character Set” on page 230.
  • KANJISJIS represents mixed single byte/multibyte characters intended for Japanese applications that rely on KanjiShiftJIS characteristics. See “KANJISJIS Server Character Set” on page 231.

    attributes

    appropriate data type, column storage, or column constraint attributes.

    See “Core Data Type Attributes” on page 17 and “Storage and Constraint Attributes” on page 18 for specific information.

    CHARACTER is ANSI SQL:2011 compliant.

    GRAPHIC is a Teradata extension to the ANSI SQL:2011 standard.

    Character data is allocated either in terms of characters or in terms of bytes, depending on the server character set used. The number of bytes of storage per character also varies depending on the server character set, as illustrated by the following table.

     

    Server Character Set    Server Form-of-Use                      Server Space Allocation    Sharable Among Heterogeneous Clients?
    LATIN                   Fixed 8-bit LATIN                       Character-based            Yes
    UNICODE                 Fixed 16-bit UNICODE                    Character-based            Yes
    GRAPHIC                 Fixed 16-bit UNICODE                    Character-based            Yes
    KANJISJIS               Mixed single and multibyte KANJISJIS    Byte-based                 Yes
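
The limits and allocation rules above can be sketched as a small calculation. The following Python is an illustration only, not part of Teradata Database: the names BYTES_PER_UNIT, MAX_N, and character_storage_bytes are hypothetical, and the lower bound of 1 for n is an assumption. The byte widths follow from the "fixed 8-bit" and "fixed 16-bit" forms-of-use in the table, and the maximums are the ones listed for n.

```python
# Sketch only: names are hypothetical; values come from the text above.

# Bytes per unit of n. LATIN is fixed 8-bit, UNICODE and GRAPHIC are
# fixed 16-bit. KANJISJIS is byte-based, so n already counts bytes.
BYTES_PER_UNIT = {"LATIN": 1, "UNICODE": 2, "GRAPHIC": 2, "KANJISJIS": 1}

# Maximum n: characters for LATIN/UNICODE/GRAPHIC, bytes for KANJISJIS.
MAX_N = {"LATIN": 64000, "UNICODE": 32000, "GRAPHIC": 32000, "KANJISJIS": 32000}


def character_storage_bytes(server_character_set, n=1):
    """Return the bytes allotted to CHARACTER(n); n defaults to 1."""
    charset = server_character_set.upper()
    if charset not in MAX_N:
        raise ValueError(f"unsupported server character set: {charset}")
    # Assumption: n must be at least 1.
    if n < 1 or n > MAX_N[charset]:
        raise ValueError(f"n must be between 1 and {MAX_N[charset]} for {charset}")
    return n * BYTES_PER_UNIT[charset]


print(character_storage_bytes("LATIN", 7))    # -> 7
print(character_storage_bytes("UNICODE", 7))  # -> 14
print(character_storage_bytes("GRAPHIC"))     # -> 2 (default n is 1)
```

The sketch covers server-side allocation only; the number of bytes exported to a client depends on the client character set.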

    Whenever a client application communicates with Teradata Database, it indicates its character set (form-of-use for character data). The server returns all character data to the client application in that form.

    Any conversion to or from the client system data types is done by Teradata Database.

    For information on the number of bytes exported for the CHARACTER type, see “Exported Character Data” on page 225.

    The default display format of CHARACTER(n) is X(n). For example, under the format X(5), the value ‘HELLO’ displays as ‘HELLO’.

    You can use GRAPHIC to represent multibyte character data.

    GRAPHIC(n) is equivalent to CHARACTER(n) CHARACTER SET GRAPHIC. For best practice, define all GRAPHIC(n) data as CHARACTER(n) CHARACTER SET GRAPHIC.

    Each multibyte character in a graphic string is stored as two bytes per logical character. Therefore, a graphic data string always occupies an even number of bytes.

    If you specify GRAPHIC without the length (n), the default is GRAPHIC(1).

    The following rules apply to truncation and padding of GRAPHIC data.

     

    IF a graphic string is …                           THEN …
    shorter than the specified length of the column    the remaining space is filled with the graphic pad character.
    longer than the specified length of the column     the extra characters are truncated.
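
The padding and truncation rules can be modeled as follows. This is a minimal Python sketch, not Teradata code: fit_graphic is a hypothetical helper, and the default pad character (U+3000, the ideographic space) is an assumption, since the text above does not say which character the graphic pad character is.

```python
# Assumption: the graphic pad character is modeled as U+3000.
GRAPHIC_PAD = "\u3000"


def fit_graphic(value, n, pad=GRAPHIC_PAD):
    """Pad or truncate value to exactly n logical characters."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(value) < n:
        # Shorter than the column: fill the remaining space with the pad.
        return value + pad * (n - len(value))
    # Same length or longer: keep the first n characters.
    return value[:n]


print(fit_graphic("AB", 4, pad="_"))  # -> AB__
print(fit_graphic("ABCDE", 3))        # -> ABC
```

Either way the result has exactly n logical characters, which at two bytes per character gives 2n bytes of storage.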

    GRAPHIC types accommodate the following client character sets:

  • KanjiEBCDIC double byte graphic data
  • KanjiShift-JIS for double byte Shift-JIS codes
  • KanjiEUC for fixed-length, double byte EUC characters

    Use the following syntax for KanjiEBCDIC graphic string literals:

    Under a KanjiEBCDIC character set, multibyte characters in a graphic string constant must be delimited with the Shift-Out/Shift-In characters; for example:

       INSERT INTO TableEBCDIC (ColGRAPH) 
        VALUES (G'<AB>');

    where AB is a valid string of KanjiEBCDIC multibyte characters, G specifies the string must be in the Graphic repertoire, and each apostrophe is a single byte character.

    The following table lists the client representation for the IBM DB2 GRAPHIC type.

    Determining the application definitions and client data types is the responsibility of the application programmer.

     

    Client CPU Architecture    Client Internal Data Format
    IBM mainframe              2n bytes of n DB2 GRAPHIC characters.

     

    FOR information on … SEE …

  • character literals: “Character String Literals” on page 88.
  • conversion of external-to-internal and internal-to-external character data, including truncation and error handling: International Character Set Support.

    In the following table definition, the column named Sex is assigned the CHARACTER data type with a length of one, and the column named Frgn_Lang is assigned the CHARACTER data type with a length of seven.

       CREATE TABLE PersonalData
         (Id INTEGER
         ,Age INTEGER
         ,Sex CHARACTER NOT NULL UPPERCASE
         ,Frgn_Lang CHARACTER(7) NULL UPPERCASE );

    Consider the following table:

       CREATE TABLE Product1Data
         (id1 INTEGER
         ,code1 CHARACTER(3) CHARACTER SET GRAPHIC);

    Assume that column code1 contains the following data:

       457F4577456D

    Under a KanjiEBCDIC session in record or indicator mode, the contents of code1 are returned to the user as follows:

       457F4577456D 

    Under a KanjiEBCDIC session in field mode, the contents of code1 are returned to the user in proper format, as follows:

       0E457F4577456D0F
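
The difference between the two results is the pair of delimiter bytes around the graphic data: 0E in front and 0F behind, which are the Shift-Out and Shift-In characters described earlier. The following Python sketch reproduces the two forms from the stored hexadecimal value. The function names are hypothetical, and the code only illustrates the byte layout shown above; it is not how Teradata Database performs the conversion.

```python
SHIFT_OUT = 0x0E  # leading delimiter seen in the field mode result
SHIFT_IN = 0x0F   # trailing delimiter seen in the field mode result


def record_mode_form(graphic_hex):
    """Record or indicator mode: the graphic bytes are returned unchanged."""
    return bytes.fromhex(graphic_hex).hex().upper()


def field_mode_form(graphic_hex):
    """Field mode: wrap the graphic bytes in Shift-Out/Shift-In."""
    raw = bytes.fromhex(graphic_hex)
    if len(raw) % 2 != 0:
        # Graphic data is two bytes per logical character.
        raise ValueError("graphic data must have an even number of bytes")
    wrapped = bytes([SHIFT_OUT]) + raw + bytes([SHIFT_IN])
    return wrapped.hex().upper()


print(record_mode_form("457F4577456D"))  # -> 457F4577456D
print(field_mode_form("457F4577456D"))   # -> 0E457F4577456D0F
```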