Because thousands of characters are required to write Japanese, it is not possible to represent every character in a single byte. For this reason, Japanese character sets use multibyte characters, either exclusively or in combination with single-byte characters.
The character sets available under Teradata Database Japanese character support use several mapping standards.
| Standard | Description |
|---|---|
| JIS X 0201 | Similar to the ISO 8859 family of standards, except for some changes in the ASCII region. The range 0xA1 through 0xDF is used mainly for Hankaku Katakana. |
| JIS X 0208 | A double-byte standard that includes the more common Kanji characters along with many uncommon ones. It also includes Hiragana, Katakana, and Zenkaku Romaji characters, as well as Greek, Cyrillic, and various other characters. |
| JIS X 0212 | A double-byte standard designed to include many of the rarer Kanji characters. |
| IBM Code Page 300 | A double-byte standard similar in content to JIS X 0208, but designed for an EBCDIC platform. |
| IBM-provided single-byte standards for Japanese | Based on EBCDIC, but include Hankaku Katakana characters. These mapping standards are described in more detail in the descriptions of individual supported character sets. |
| UTF-8 | An encoding of Unicode optimized for backward compatibility with ASCII. In Teradata UTF8, a character consists of one to three bytes. |
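The difference between the single-byte JIS X 0201 range and the double-byte JIS X 0208 range can be seen with a short script. This is a minimal sketch, not Teradata code: it assumes Python's built-in `shift_jis` codec as a stand-in for an encoding that combines the two standards, and the helper names `encoded_width` and `is_hankaku_katakana_byte` are hypothetical.

```python
# Illustrative sketch only. Python's built-in "shift_jis" codec is an assumed
# stand-in for an encoding that mixes JIS X 0201 (single-byte) and
# JIS X 0208 (double-byte) characters; it is not a Teradata character set.

def encoded_width(ch: str, encoding: str = "shift_jis") -> int:
    """Return the number of bytes used to encode one character."""
    return len(ch.encode(encoding))

def is_hankaku_katakana_byte(b: int) -> bool:
    """True if the byte lies in 0xA1-0xDF, the JIS X 0201 Hankaku Katakana range."""
    return 0xA1 <= b <= 0xDF

# Characters are written as escapes so the script prints ASCII only.
samples = [
    ("A", "ASCII-range letter"),
    ("\uff71", "Hankaku Katakana A"),   # half-width katakana
    ("\u30a2", "Zenkaku Katakana A"),   # full-width katakana
    ("\u3042", "Hiragana A"),
    ("\u6f22", "Kanji (U+6F22)"),
]

for ch, label in samples:
    raw = ch.encode("shift_jis")
    print(f"{label:20} {len(raw)} byte(s): {raw.hex()}")
```

Under these assumptions, the Hankaku Katakana character occupies a single byte inside the 0xA1 through 0xDF range, while Zenkaku Katakana, Hiragana, and Kanji each occupy two bytes.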
For more information on Japanese encodings and mapping standards, see Appendix B: “Japanese Encodings and Mapping Standards.”
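The one-to-three-byte range given for UTF-8 in the table above can be checked in the same way. The sketch below uses Python's standard `utf-8` codec, which is an assumption made for illustration; it says nothing about Teradata UTF8 behavior beyond what the table states, and the helper name `utf8_width` is hypothetical.

```python
# Illustrative sketch only: uses Python's standard "utf-8" codec, not Teradata.

def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))

# Characters are written as escapes so the script prints ASCII only.
samples = [
    ("A", "ASCII letter"),
    ("\u03b1", "Greek alpha"),
    ("\uff71", "Hankaku Katakana A"),
    ("\u3042", "Hiragana A"),
    ("\u6f22", "Kanji (U+6F22)"),
]

for ch, label in samples:
    print(f"{label:20} U+{ord(ch):04X} -> {utf8_width(ch)} byte(s)")

# Backward compatibility with ASCII: the encoded bytes are identical.
print("A".encode("utf-8") == "A".encode("ascii"))
```

ASCII characters stay at one byte, which is the backward compatibility the table mentions, while the Japanese characters sampled here each take three bytes.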