KanjiEUC refers to the character set KANJIEUC_0U, which is compatible with the UNIX operating system.
KanjiEUC emulates the standard Extended UNIX Code style of mixed single- and multibyte character data, where the most significant bit of each byte classifies the byte as a single-byte character or part of a multibyte character.
The KanjiEUC character set includes all characters in the JIS X 0201, JIS X 0208, and JIS X 0212 standards, plus extensions.
The valid ranges for JIS X 0201 characters in KanjiEUC dictionary data include all of U.S. ASCII and the portion of JIS X 0201 for which the second byte ranges from 0xA1 through 0xDF. See the rows for code set 0 (cs0) and code set 2 (cs2) in the following table.
KanjiEUC uses the four external code sets defined in the following table.
Code Set | Description | Note |
0 | Single-byte character data. Not permitted for the GRAPHIC server character set. | JIS X 0201. For the detailed encoding, see “Shift-JIS Encoding: Detailed View” on page 167. |
1 | Two-byte character data | JIS X 0208 |
2 | Two-byte multibyte character with first byte ss2=0x8E. Not permitted for the GRAPHIC server character set. | JIS X 0201 Hankaku Katakana |
3 | Three-byte multibyte character with first byte ss3=0x8F | JIS X 0212 |
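The lead-byte rules in the table above can be sketched as a small parser. This is a minimal illustration, not part of any Teradata API: the function name `split_kanjieuc` is hypothetical, and the 0xA1 through 0xFE lead-byte range used for cs1 is taken from the general EUC convention rather than from the table itself. The sketch only frames characters; it does not check that trailing bytes fall within the ranges each JIS standard defines.

```python
SS2 = 0x8E  # single-shift 2: introduces a two-byte cs2 character
SS3 = 0x8F  # single-shift 3: introduces a three-byte cs3 character


def split_kanjieuc(data: bytes):
    """Split KanjiEUC bytes into (code_set, character_bytes) tuples.

    Hypothetical helper for illustration only.
    """
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Most significant bit clear: cs0, a single-byte character.
            code_set, length = 0, 1
        elif lead == SS2:
            # cs2: 0x8E followed by one byte.
            code_set, length = 2, 2
        elif lead == SS3:
            # cs3: 0x8F followed by two bytes.
            code_set, length = 3, 3
        elif 0xA1 <= lead <= 0xFE:
            # cs1: two-byte character (assumed lead-byte range).
            code_set, length = 1, 2
        else:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        if i + length > len(data):
            raise ValueError(f"truncated character at offset {i}")
        chars.append((code_set, data[i:i + length]))
        i += length
    return chars
```

For example, `split_kanjieuc(b"A\xa4\xa2\x8e\xb1")` yields one cs0 character, one cs1 character, and one cs2 character, in that order.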
Object names on systems enabled with Japanese language support can contain single-byte Latin and Katakana characters from the JIS X 0201 standard, and double-byte characters from the JIS X 0208 standard.
The valid ranges for JIS X 0201 characters in object names under the KanjiEUC client character set appear in rows cs0 and cs2 in “KanjiEUC Code Set Localization” on page 162. Object names cannot contain the Katakana symbols 0x8EA1 through 0x8EA5, or Unicode symbols other than $, #, and _.
The valid ranges for JIS X 0208 characters in object names under the KanjiEUC client character set appear in row cs1 in “KanjiEUC Code Set Localization” on page 162. Characters in the reserved regions of the standard are not allowed.
Characters from JIS X 0212 (row cs3) are not valid in object names. Additionally, some characters that are valid in JIS X 0208 do not map to the KanjiEBCDIC encoding and are not valid in KanjiEUC object names. The following table provides a complete list of multibyte character codes that are not valid for object names under the KANJIEUC_0U character set.
First Byte | Second Byte | Third Byte |
0xA1 | 0xA1 - 0xAA, 0xAD - 0xB1, 0xB3 - 0xBB, 0xBD - 0xEF, 0xF1 - 0xF3, 0xF5 - 0xFE | |
0xA2 | 0xA1 - 0xFE | |
0xA6 - 0xA8 | 0xA1 - 0xFE | |
0xF4 | 0xA5 - 0xA6 | |
0x8E | 0xA1 - 0xA5 | |
0x8F | 0xA1 - 0xFE | 0xA1 - 0xFE |
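As a rough illustration, the table above can be transcribed into a lookup. This sketch assumes that all six second-byte ranges listed after first byte 0xA1 belong to that first byte; the name `is_invalid_in_object_name` is hypothetical and not a Teradata function. It covers only the multibyte codes in this table, not the other object-name rules (reserved regions of JIS X 0208, or the restrictions on single-byte symbols).

```python
# (first-byte range, second-byte ranges) pairs transcribed from the table.
_INVALID_TWO_BYTE = [
    ((0xA1, 0xA1), [(0xA1, 0xAA), (0xAD, 0xB1), (0xB3, 0xBB),
                    (0xBD, 0xEF), (0xF1, 0xF3), (0xF5, 0xFE)]),
    ((0xA2, 0xA2), [(0xA1, 0xFE)]),
    ((0xA6, 0xA8), [(0xA1, 0xFE)]),
    ((0xF4, 0xF4), [(0xA5, 0xA6)]),
    ((0x8E, 0x8E), [(0xA1, 0xA5)]),
]


def is_invalid_in_object_name(char: bytes) -> bool:
    """Return True if one multibyte character is listed in the table.

    Hypothetical helper: `char` holds the bytes of a single character.
    """
    if len(char) == 3:
        # Last table row: every cs3 (JIS X 0212) character is excluded.
        return (char[0] == 0x8F
                and 0xA1 <= char[1] <= 0xFE
                and 0xA1 <= char[2] <= 0xFE)
    if len(char) != 2:
        return False  # single-byte characters are outside this table
    first, second = char
    for (f_lo, f_hi), second_ranges in _INVALID_TWO_BYTE:
        if f_lo <= first <= f_hi:
            return any(lo <= second <= hi for lo, hi in second_ranges)
    return False
```

Under this reading, `is_invalid_in_object_name(b"\xa1\xa1")` returns `True`, while a code that falls in one of the gaps between the listed ranges, such as `b"\xa1\xb2"`, returns `False`.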
For information on the rules and restrictions for naming Teradata Database objects, see the topics beginning with “About Object Names” on page 17.
Also see SQL Fundamentals, which covers topics such as:
For more information on … | See … |
the JIS X 0201 standard | |
the JIS X 0208 standard | |
the standard Extended UNIX Code (EUC) style of mixed single- and multibyte character data | |