Shift-JIS (DOS Kanji) Encoding - Teradata Database

International Character Set Support

Product: Teradata Database
Release Number: 15.10
Language: English (United States)
Last Update: 2018-09-25
dita:id: B035-1125
lifecycle: previous
Product Category: Teradata® Database

For Windows client systems, the Teradata Database supports Shift-JIS encoding.

The KANJISJIS_0S client character set emulates the Shift-JIS style of mixed single-byte and multibyte character data, where the value of the first byte of a character determines whether the character is represented as one byte or two bytes.

DOS/V is an implementation of a Japanese character set that uses the undefined columns of JIS X 0201 as the first bytes of 2-byte Kanji characters. This encoding is referred to as Shift-JIS encoding.

The following tables show the Shift-JIS encoding by byte value and the encodings of selected Shift-JIS characters. The figures in “Shift-JIS Encoding for Kanji” and “Shift-JIS Encoding: Detailed View” illustrate the encoding ranges.


Hex Representation of Shift-JIS    Shift-JIS Implementation
0x00-0x7E, 0xA1-0xDF               JIS X 0201
0x81-0x9F, 0xE0-0xFC               First byte of double-byte representation. Its mapping is as follows:
                                   1. 0x81-0x9F: Contains rows 1 to 62 from JIS X 0208.
                                   2. 0xE0-0xEF: Contains rows 63 to 94 from JIS X 0208.
                                   3. 0xF0-0xF9: Contains 1,880 Gaiji characters.
                                   4. 0xFA-0xFC: Contains IBM-defined characters.
0x40-0x7E, 0x80-0xFC               Second byte of double-byte representation.
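
The byte ranges in the table above can be applied mechanically to split a byte string into one-byte and two-byte characters. The following Python sketch is illustrative only: the function names are hypothetical and are not part of any Teradata interface, and it assumes that any byte outside the ranges listed in the table should be rejected.

```python
# Illustrative sketch only; the ranges come from the table above,
# the function names are hypothetical.

def is_single_byte(b: int) -> bool:
    """JIS X 0201 single-byte range."""
    return 0x00 <= b <= 0x7E or 0xA1 <= b <= 0xDF


def is_lead_byte(b: int) -> bool:
    """First byte of a double-byte character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def is_trail_byte(b: int) -> bool:
    """Second byte of a double-byte character (0x7F is excluded)."""
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def split_sjis(data: bytes) -> list:
    """Split data into a list of 1-byte and 2-byte character encodings."""
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        if is_single_byte(b):
            width = 1
        elif is_lead_byte(b):
            # A lead byte must be followed by a valid second byte.
            if i + 1 >= len(data) or not is_trail_byte(data[i + 1]):
                raise ValueError(
                    f"invalid or missing second byte after 0x{b:02X} at offset {i}"
                )
            width = 2
        else:
            # Assumption: bytes not listed in the table are treated as errors.
            raise ValueError(f"byte 0x{b:02X} at offset {i} is not in the table")
        chars.append(data[i:i + width])
        i += width
    return chars


if __name__ == "__main__":
    # One ASCII letter, one double-byte character, one single-byte 0xA1-0xDF value.
    print(split_sjis(b"A\x81\x40\xb1"))
```

Because the first byte alone decides the width, the scan never needs to look back; it only needs one byte of lookahead to validate the second byte.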


Double-byte Space    Double-byte Underscore    Double-byte Percent
0x8140               0x8151                    0x8193
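
To see which characters these three codes represent, they can be decoded with a general-purpose Shift-JIS codec. The sketch below uses Python's built-in shift_jis codec as a stand-in; this is an assumption for illustration and is not the KANJISJIS_0S mapping itself, which may differ for some code points. The helper name decode_code is hypothetical.

```python
# Illustrative sketch: Python's "shift_jis" codec is used as a stand-in
# for the client character set; decode_code is a hypothetical helper.

SAMPLES = {
    "Double-byte Space": 0x8140,
    "Double-byte Underscore": 0x8151,
    "Double-byte Percent": 0x8193,
}


def decode_code(code: int) -> str:
    """Decode a two-byte Shift-JIS code given as an integer such as 0x8140."""
    return code.to_bytes(2, "big").decode("shift_jis")


if __name__ == "__main__":
    for name, code in SAMPLES.items():
        ch = decode_code(code)
        print(f"{name}: 0x{code:04X} -> U+{ord(ch):04X}")
```

With the standard Python codec, the three codes decode to U+3000 (ideographic space), U+FF3F (fullwidth low line), and U+FF05 (fullwidth percent sign).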

Note that although the “Shift-JIS Encoding for Kanji” graphic shows data at second byte 0x7F, no character uses 0x7F as a second byte, as documented in the Shift-JIS encoding table (see “Shift-JIS Encoding” on page 165).

The next figure (“Shift-JIS Encoding: Detailed View” on page 167) shows a more detailed view of the Shift-JIS encoding. The shaded regions show the JIS X 0208 area, the Gaiji area, and the IBM‑defined character area. Observe that only the range 0x8000-0xFFFF is shown.

This figure illustrates the following:

  • Shift-JIS range 0x8140-0x9FFC contains rows 1 to 62 from JIS X 0208.
  • Shift-JIS range 0xE040-0xEFFC contains rows 63 to 94 from JIS X 0208.
  • Shift-JIS range 0xF040-0xF9FC contains 1,880 Gaiji characters. This range is equivalent to EUC ss3 range 0xA1A1-0xB4FE.
  • Shift-JIS range 0xFA40-0xFCFC contains IBM-defined characters. IBM characters exist in IBM EBCDIC Kanji, but not in JIS X 0208.
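
The ranges listed above can be checked with a little arithmetic. Each first byte pairs with 188 valid second bytes (0x40-0x7E and 0x80-0xFC), so the ten first bytes 0xF0-0xF9 give 10 × 188 = 1,880 Gaiji positions, and the 31 first bytes 0x81-0x9F give 31 × 188 = 62 × 94 positions, that is, 62 rows of 94 characters. The Python sketch below classifies a two-byte code by area and computes these capacities; the names AREAS, classify, and capacity are hypothetical and serve only as an illustration.

```python
# Illustrative sketch; area boundaries come from the list above,
# all names are hypothetical.

# Valid second bytes: 0x40-0x7E and 0x80-0xFC (0x7F is not used).
TRAIL_BYTES = frozenset(b for b in range(0x40, 0xFD) if b != 0x7F)

# (first-byte low, first-byte high, area name)
AREAS = [
    (0x81, 0x9F, "JIS X 0208 rows 1-62"),
    (0xE0, 0xEF, "JIS X 0208 rows 63-94"),
    (0xF0, 0xF9, "Gaiji"),
    (0xFA, 0xFC, "IBM-defined"),
]


def classify(code: int) -> str:
    """Return the area name for a two-byte Shift-JIS code such as 0xF040."""
    lead, trail = code >> 8, code & 0xFF
    if trail not in TRAIL_BYTES:
        raise ValueError(f"0x{trail:02X} is not a valid second byte")
    for low, high, name in AREAS:
        if low <= lead <= high:
            return name
    raise ValueError(f"0x{lead:02X} is not a valid first byte")


def capacity(low: int, high: int) -> int:
    """Number of code positions for first bytes in low..high inclusive."""
    return (high - low + 1) * len(TRAIL_BYTES)


if __name__ == "__main__":
    for low, high, name in AREAS:
        print(f"{name}: {capacity(low, high)} positions")
    print(classify(0x8140))
```

The same arithmetic explains the EUC equivalence noted above: 0xA1A1-0xB4FE spans 20 rows of 94 positions, which is also 1,880.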

For more information on …                                        See …
the JIS X 0201 standard                                          “JIS X 0201” on page 151.
the JIS X 0208 standard                                          “JIS X 0208” on page 153.
the KANJISJIS_0S client character set that emulates Shift-JIS    “Windows-Compatible Japanese Character Sets” on page 37.