NAME identifies the character set to which the description applies. The name may include a standard suffix that defines the encoding scheme. The standard suffix consists of an underscore, a number not relevant to CLIv2, the encoding character (A, E, I, R, S, T, or U), and an optional character not relevant to CLIv2. Each suffix corresponds to an ENCODING operand value:
- E - EBCDIC
- I - IBMSOSI
- A - ASCII
- R - BIGFIVE
- S - SJIS
- T - EUC-CN or EUC-KR
- U - EUC-JP
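The suffix convention above can be sketched as a small lookup. This is an illustrative helper, not part of CLIv2: the function name `encodings_from_name`, the regular expression, and the sample names in the usage note are hypothetical. It assumes that the "number" is one or more decimal digits, that the optional trailing character is a single letter or digit, and that the suffix sits at the very end of an uppercase name.

```python
import re
from typing import Optional, Tuple

# Suffix letter -> ENCODING operand value(s), taken from the list above.
# "T" is ambiguous: it can stand for either EUC-CN or EUC-KR.
SUFFIX_ENCODINGS = {
    "A": ("ASCII",),
    "E": ("EBCDIC",),
    "I": ("IBMSOSI",),
    "R": ("BIGFIVE",),
    "S": ("SJIS",),
    "T": ("EUC-CN", "EUC-KR"),
    "U": ("EUC-JP",),
}

# Assumed suffix shape: "_" + digits + encoding letter + at most one
# trailing alphanumeric character, anchored at the end of the name.
_SUFFIX_RE = re.compile(r"_(\d+)([AEIRSTU])([A-Za-z0-9])?$")


def encodings_from_name(name: str) -> Optional[Tuple[str, ...]]:
    """Return the ENCODING values implied by a standard suffix, or None."""
    match = _SUFFIX_RE.search(name)
    if match is None:
        return None  # no standard suffix; ENCODING must be given explicitly
    return SUFFIX_ENCODINGS[match.group(2)]
```

Under these assumptions, a hypothetical name such as `EXAMPLE_0A` maps to ASCII, `EXAMPLE_1T0` yields both EUC-CN and EUC-KR because the letter alone does not distinguish them, and a name without a standard suffix, such as `KOREAN_EBCDIC933`, yields `None`.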
ENCODING optionally identifies the encoding scheme for the character set. If omitted, the character set name must contain a standard suffix that indicates the encoding. If such a suffix exists, the encoding cannot be overridden using this operand. The following ENCODING values are available in CLIv2.
ENCODING | Meaning | Characteristics |
---|---|---|
EBCDIC | Extended Binary-Coded-Decimal Interchange Code | |
IBMSOSI | IBM Shift-out/Shift-in | |
ASCII | American Standard Code for Information Interchange | |
BIGFIVE | Big Five Plus | |
EUC-CN | Extended Unix Code - China | |
EUC-JP | Extended Unix Code - Japan | |
EUC-KR | Extended Unix Code - Korea | |
SJIS | Shift-JIS (Japanese Industrial Standard) | |
UHC | Unified Hangul Code | |
UTF8 | UCS (Universal Character Set) Transformation Format 8-bit | Most four-byte codepoints (X'F0' through X'F4') are not supported by Teradata Database. |
UTF16 | UCS (Universal Character Set) Transformation Format 16-bit | Surrogates (four-byte codepoints that begin or end with the two-byte codepoints X'D800' through X'DBFF') are not supported by Teradata Database. |
While all codepoints are reflected to and from Teradata Database, for character sets that allow mixtures of single and multi-byte characters, only the single-byte characters are meaningful to CLIv2.
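The UTF8 and UTF16 restrictions in the table describe the same group of characters: those outside the Basic Multilingual Plane, which take four bytes in UTF-8 (lead byte X'F0' through X'F4') and a surrogate pair in UTF-16. A minimal sketch for flagging such characters before sending text follows. The function name `supplementary_characters` is hypothetical, and because the table says only that most of these codepoints are unsupported, the result is a list of candidates to review rather than a definitive rejection list.

```python
from typing import List


def supplementary_characters(text: str) -> List[str]:
    """Return the characters in text that lie above U+FFFF.

    These are the characters that encode as four bytes in UTF-8 and as a
    surrogate pair in UTF-16, so they are the ones that may be unsupported.
    """
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Example: U+1F600 lies outside the Basic Multilingual Plane.
sample = "A\U0001F600"
flagged = supplementary_characters(sample)

# The flagged character is four bytes long in UTF-8 ...
utf8_bytes = flagged[0].encode("utf-8")
# ... and a surrogate pair (two 16-bit units) in big-endian UTF-16.
utf16_bytes = flagged[0].encode("utf-16-be")
```

For the sample above, only the second character is flagged; the single-byte character `A` passes through untouched.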
Example
Begin definition for IBM Code Page 833, the single-byte component for IBM CCSID 933.
CHARSET NAME KOREAN_EBCDIC933 ENCODING IBMSOSI