Japanese Encoding Schemes - Teradata Database

International Character Set Support

Product

Teradata Database

Release Number

15.10

Language

English (United States)

Last Update

2018-09-25

dita:id

B035-1125

Product Category

Teradata® Database

Because writing Japanese requires thousands of characters, a single byte cannot represent every character. For this reason, Japanese character sets use either:

A multibyte mapping standard.

A combination of a multibyte standard, which handles most of the large number of required characters, and a single-byte standard, which efficiently encodes a smaller number of frequently used characters.
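As a rough illustration of the second approach, the sketch below measures how many bytes a few sample characters occupy in a combined single-byte/double-byte scheme. It assumes Python's built-in `shift_jis` codec as a familiar stand-in for such a scheme; it is not claimed to match any particular Teradata character set byte for byte, and the names `SAMPLES` and `byte_lengths` are hypothetical.

```python
# Illustration only: Shift_JIS (Python's built-in codec) is used here as a
# familiar example of a combined single-byte/double-byte scheme. It is an
# assumption for this sketch, not a statement about Teradata internals.

# A single byte can distinguish at most 2**8 = 256 values, far fewer than
# the thousands of characters needed for Japanese.
SINGLE_BYTE_VALUES = 2 ** 8

# Sample characters, written as escapes so the source stays ASCII-only.
SAMPLES = {
    "ASCII letter": "A",
    "Hankaku Katakana": "\uff71",  # HALFWIDTH KATAKANA LETTER A
    "Hiragana": "\u3042",          # HIRAGANA LETTER A
    "Kanji": "\u6f22",             # the first kanji of the word "kanji"
}


def byte_lengths(encoding):
    """Return the encoded length in bytes of each sample character."""
    return {label: len(ch.encode(encoding)) for label, ch in SAMPLES.items()}


sjis_lengths = byte_lengths("shift_jis")
for label, n in sjis_lengths.items():
    print(f"{label}: {n} byte(s)")
```

In this scheme the frequently used ASCII and Hankaku Katakana characters each take one byte, while Hiragana and Kanji take two.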

The Japanese character sets that Teradata Database supports use several mapping standards.

Standard	Description
JIS X 0201	Similar to the ISO 8859 family of standards, except for some changes in the ASCII region. The range 0xA1-0xDF is used mainly for Hankaku (half-width) Katakana.
JIS X 0208	A double-byte standard that includes the more common Kanji characters along with many uncommon ones. It also includes Hiragana, Katakana and Zenkaku Romaji characters, as well as Greek, Cyrillic, and various other characters.
JIS X 0212	A double-byte standard that was designed to include many of the rarer Kanji characters.
IBM Code Page 300	A double-byte standard similar in content to JIS X 0208, but designed for an EBCDIC platform.
IBM-provided single-byte standards for Japanese	Based on EBCDIC, but include Hankaku Katakana characters. These mapping standards are described in more detail in the descriptions of individual supported character sets.
UTF-8	An encoding of Unicode designed for backward compatibility with ASCII. In Teradata UTF8, a character can consist of one to three bytes.
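Two of the rows above can be checked with a short sketch. It assumes Python's standard `utf-8` codec (not Teradata's own UTF8 implementation) to show the one-to-three-byte range, and Python's `shift_jis` codec as a stand-in for the upper half of JIS X 0201 to show the 0xA1-0xDF range used for Hankaku Katakana. The helper `utf8_len` and the variable names are hypothetical.

```python
# Illustration only: these are Python's standard codecs, used as stand-ins.
# Nothing here calls or emulates a Teradata API.

def utf8_len(ch):
    """Return the number of bytes standard UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))


utf8_samples = {
    "U+0041 (ASCII letter A)": utf8_len("A"),
    "U+00E9 (Latin e with acute)": utf8_len("\u00e9"),
    "U+FF71 (Hankaku Katakana A)": utf8_len("\uff71"),
    "U+6F22 (Kanji)": utf8_len("\u6f22"),
}
for label, n in utf8_samples.items():
    print(f"{label}: {n} byte(s) in UTF-8")

# The Unicode halfwidth Katakana block runs from U+FF61 through U+FF9F.
hankaku = [chr(cp) for cp in range(0xFF61, 0xFFA0)]
hankaku_bytes = [ch.encode("shift_jis") for ch in hankaku]

# Every one of these characters encodes to exactly one byte.
all_single_byte = all(len(b) == 1 for b in hankaku_bytes)
lowest = min(b[0] for b in hankaku_bytes)
highest = max(b[0] for b in hankaku_bytes)
print(f"single byte: {all_single_byte}, range: {lowest:#x}-{highest:#x}")
```

Note that in UTF-8 the Hankaku Katakana character takes three bytes, whereas in the JIS X 0201 range it takes one, which is the efficiency trade-off the single-byte standards were designed around.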

For more information on Japanese encodings and mapping standards, see Appendix B: “Japanese Encodings and Mapping Standards.”