Teradata supplies several multibyte character sets to support Simplified Chinese and Traditional Chinese on mainframe and network-attached clients.
Each character set uses a specific encoding form to distinguish single-byte characters from multibyte characters.
| Character Set | Description | Encoding Form |
|  | Simplified Chinese (IBM CCSID 935) for mainframe clients. | EBCDIC Shift-Out/Shift-In. Shift-out character 0x0E and shift-in character 0x0F bracket each string of double-byte characters. |
|  | Traditional Chinese (IBM CCSID 937) for mainframe clients. | EBCDIC Shift-Out/Shift-In. Shift-out character 0x0E and shift-in character 0x0F bracket each string of double-byte characters. |
|  | Simplified Chinese (mixed GB2312) for network-attached clients. | Extended UNIX Code (EUC) composed of two code sets: cs0 for single-byte characters and cs1 for double-byte characters. |
|  | Traditional Chinese (Big5) for network-attached clients. | Value of first byte in sequence distinguishes single-byte characters from double-byte characters. |
|  | Simplified Chinese (mixed GB2312) for network-attached clients. | Value of first byte in sequence distinguishes single-byte characters from double-byte characters. |
|  | Traditional Chinese (Big5) for network-attached clients. | Value of first byte in sequence distinguishes single-byte characters from double-byte characters. |
To determine whether a Chinese character is valid in an object name:
1 Find the text file that maps the client character set to Unicode. The file is on the documentation CD and on the Web at http://www.info.teradata.com/.
2 In the text file, find the Unicode character to which the client character in question maps.
3 Find the file that identifies valid Unicode characters, UOBJNEXT.txt, available on the Teradata User Documentation CD and at http://www.info.teradata.com.
4 If the Unicode character appears in the file that is applicable to your system, you can use the client character that maps to it in an object name.
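The lookup in the steps above can be sketched in Python. The layout of the mapping file and of UOBJNEXT.txt is not described here, so this sketch assumes a hypothetical layout: one entry per line, hexadecimal values separated by whitespace, with `#` starting a comment. The function names and the sample values are also illustrative only. Adapt the parsing to the real files.

```python
def _fields(line: str):
    """Return the whitespace-separated fields of a line, ignoring '#' comments."""
    return line.split("#")[0].split()


def parse_mapping(text: str) -> dict:
    """Parse '<client hex> <Unicode hex>' lines (assumed layout) into a dict."""
    mapping = {}
    for line in text.splitlines():
        fields = _fields(line)
        if len(fields) >= 2:
            mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


def parse_valid(text: str) -> set:
    """Parse one Unicode hex code point per line (assumed layout) into a set."""
    return {int(f[0], 16) for f in map(_fields, text.splitlines()) if f}


def valid_in_object_name(client_code: int, mapping: dict, valid: set) -> bool:
    """Steps 2 to 4: map the client character to Unicode, then check the valid list."""
    unicode_cp = mapping.get(client_code)
    return unicode_cp is not None and unicode_cp in valid


# Illustrative sample data, not taken from the real Teradata files.
mapping = parse_mapping("A4A4 4E2D\nA140 3000  # sample entries\n")
valid = parse_valid("4E2D\n")
print(valid_in_object_name(0xA4A4, mapping, valid))
```

A client code that has no entry in the mapping file is treated as not valid, which is the conservative choice when the mapping is incomplete.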
Character data entered using Chinese client character sets should be stored in columns defined as Unicode. The UNICODE server character set requires two bytes of storage per character, so a CHAR(5) CHARACTER SET UNICODE column occupies 10 bytes of storage.
Given the 64000-byte limit on column size, a Unicode column cannot exceed 32000 characters. Furthermore, the combined size of the character data and all other column data in a row cannot exceed the 64000-byte limit on row size.
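The storage arithmetic can be checked with a short Python sketch. The function and constant names are hypothetical, and the row check ignores any per-row or per-column overhead, which this section does not quantify.

```python
BYTES_PER_UNICODE_CHAR = 2  # UNICODE server character set: two bytes per character
LIMIT_BYTES = 64000         # column and row size limit stated in the text


def unicode_storage_bytes(n_chars: int) -> int:
    """Bytes occupied by a fixed-width CHAR(n) CHARACTER SET UNICODE column."""
    return n_chars * BYTES_PER_UNICODE_CHAR


def max_unicode_chars(limit_bytes: int = LIMIT_BYTES) -> int:
    """Largest Unicode column length, in characters, that fits the byte limit."""
    return limit_bytes // BYTES_PER_UNICODE_CHAR


def row_fits(unicode_char_lengths, other_bytes: int = 0) -> bool:
    """Check Unicode columns plus other data against the row limit (overhead ignored)."""
    total = sum(unicode_storage_bytes(n) for n in unicode_char_lengths) + other_bytes
    return total <= LIMIT_BYTES


print(unicode_storage_bytes(5))  # 10
print(max_unicode_chars())       # 32000
```

Under these assumptions, a single 32000-character Unicode column already uses the whole row limit, so `row_fits([32000], other_bytes=1)` returns `False`.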