KANJI1 Character Set - Teradata Database

The KANJI1 server character set is designed for Japanese applications that must remain compatible with Teradata Kanji data from Teradata Database releases prior to V2R3.0.

The semantics and limitations of KANJI1 are identical to those of the CHARACTER server character set in releases prior to V2R3.0.

Notice:

In accordance with Teradata internationalization plans, KANJI1 support is deprecated and is to be discontinued in the near future. KANJI1 is not allowed as a default character set; the system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible.

See “KANJI1 Restrictions.”

The JIS X 0201 mapping standard is the basis for the KANJI1 server character set.

Because it stores client character data in the form-of-use in which it receives it, KANJI1 is non-canonical and cannot be shared among heterogeneous clients.

Note: Upon upgrading to Teradata Database 14.0 or later, the system automatically replaces DEFAULT CHARACTER SET KANJI1 with DEFAULT CHARACTER SET UNICODE in existing user definitions.

As part of the plans for discontinuing KANJI1 support, the creation of new KANJI1 objects is highly restricted. Including the phrase CHARACTER SET KANJI1 in any of the following statements returns a syntax error:

CREATE USER/MODIFY USER

CREATE TABLE/ALTER TABLE

CREATE FUNCTION/REPLACE FUNCTION

CREATE TYPE/ALTER TYPE

CREATE PROCEDURE/REPLACE PROCEDURE

CREATE MACRO/REPLACE MACRO

CREATE VIEW/REPLACE VIEW

CAST function

Plan to use the TRANSLATE function to convert existing KANJI1 data to UNICODE or another supported server character set. For details, see “TRANSLATE” in SQL Functions, Operators, Expressions, and Predicates.

The KANJI1 server character set is designed to support Japanese characters only when using specifically designed client character sets.

KANJI1 supports single-byte characters from the following client character sets:

EBCDIC

ASCII

UTF8

Only single-byte characters that are coded the same in both the LATIN and JIS X 0201 standards can be used.

KANJI1 supports mixed single- and multibyte characters from the following client character sets:

KanjiEBCDIC
    Single-byte data uses JIS X 0201
    Double-byte data uses SO/SI

KanjiSJIS_0S
    JIS X 0201
    JIS X 0208

Kanji1932_1S0
    JIS X 0201
    JIS X 0208

KanjiEUC_0U
    JIS X 0201
        Bytes 00-7F as defined
        Byte 8E mapped to 80
    JIS X 0208 (converted to KanjiSJIS)
    JIS X 0212
        Byte 8F mapped to FF
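The KanjiEUC_0U mappings listed above can be illustrated with a byte-level sketch. This is an approximation for illustration only, not Teradata's implementation: the function name euc_to_kanji1_sketch is hypothetical, the bytes that follow a remapped 8E or 8F byte are assumed to be copied unchanged, and Python's euc_jp and shift_jis codecs stand in for the JIS X 0208 conversion to KanjiSJIS.

```python
SS2 = 0x8E  # EUC single shift 2: introduces one JIS X 0201 katakana byte
SS3 = 0x8F  # EUC single shift 3: introduces a two-byte JIS X 0212 character


def euc_to_kanji1_sketch(data: bytes) -> bytes:
    """Approximate the KanjiEUC_0U byte mappings described in the text."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F:
            # Bytes 00-7F are kept as defined.
            out.append(b)
            i += 1
        elif b == SS2:
            # Byte 8E is mapped to 80; the following byte is copied (assumption).
            out.append(0x80)
            out += data[i + 1:i + 2]
            i += 2
        elif b == SS3:
            # Byte 8F is mapped to FF; the next two bytes are copied (assumption).
            out.append(0xFF)
            out += data[i + 1:i + 3]
            i += 3
        else:
            # Treat the next two bytes as a JIS X 0208 character and re-encode
            # it as Shift-JIS. Invalid input raises UnicodeDecodeError.
            pair = data[i:i + 2]
            out += pair.decode("euc_jp").encode("shift_jis")
            i += 2
    return bytes(out)
```

For example, the EUC bytes A4 A2 (hiragana "a") come back as the Shift-JIS bytes 82 A0, while the katakana sequence 8E B1 becomes 80 B1.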

Only a limited set of characters stored as KANJI1 can be retrieved correctly by a client character set other than the one that entered the data. This limited set includes most 7-bit ASCII characters:

The letters A-Z and a-z

The digits 0-9

Various punctuation, symbols, and control characters

In general, it does not include:

Japanese characters

The backslash (\)

Yen sign

Tilde

Overline
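As a rough aid when auditing data, the sharable and non-sharable sets described above can be approximated in a few lines. This is a conservative sketch, not a Teradata rule: the helper name non_sharable_chars is hypothetical, and it assumes every 7-bit ASCII character is sharable except the backslash and tilde, whereas the text only promises "most" of them.

```python
# Assumption: of the 7-bit ASCII range, only the backslash and tilde are
# treated as non-sharable. The yen sign, overline, and Japanese characters
# fall outside 7-bit ASCII and are flagged by the range check below.
NON_SHARABLE_ASCII = {"\\", "~"}


def non_sharable_chars(text: str) -> list:
    """Return the characters in text that may not survive retrieval
    through a client character set other than the one that stored them."""
    return [ch for ch in text if ord(ch) > 0x7F or ch in NON_SHARABLE_ASCII]
```

An empty result suggests the value uses only characters from the sharable set; any returned characters are candidates for mistranslation or retrieval errors.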

Attempts to retrieve Japanese or other non-sharable characters from a KANJI1 field may result in error messages for the following character sets:

UTF8

UTF16

Most site-defined client character sets using map files

For other character sets, attempts to retrieve Japanese or other non-sharable characters from a KANJI1 field may result in improperly translated data.

KANJI1 character data is usually a mixture of single- and multibyte characters. Therefore, even when a session uses a non-Japanese client character set, such as standard EBCDIC or ASCII, certain character configurations are interpreted either as starting a multibyte character string or as control characters.

The following interpretations apply to name and data characters for all client character sets when the server character set is KANJI1.

Characters with a client encoding of 0x0E or 0x0F are interpreted as the Shift-Out or Shift-In character, respectively, which delimit the start or end of a multibyte character string.

Characters with a client encoding of 0x80 or 0xFF are translated internally into characters that are reserved for the ss₂ and ss₃ escape characters of KanjiEUC data.
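To check whether client data contains any of these specially interpreted bytes before it is stored in a KANJI1 column, a scan along the following lines may help. It is a minimal sketch: the function name find_reserved_bytes and the short labels are hypothetical, and pairing 0x80 with ss₂ and 0xFF with ss₃ follows the KanjiEUC_0U mappings given earlier (8E to 80, 8F to FF).

```python
# Byte values that KANJI1 interprets specially, per the text above.
RESERVED_BYTES = {
    0x0E: "SO",   # Shift-Out: starts a multibyte character string
    0x0F: "SI",   # Shift-In: ends a multibyte character string
    0x80: "SS2",  # reserved for the KanjiEUC ss2 escape character
    0xFF: "SS3",  # reserved for the KanjiEUC ss3 escape character
}


def find_reserved_bytes(data: bytes) -> list:
    """Return (offset, label) for each specially interpreted byte in data."""
    return [(i, RESERVED_BYTES[b]) for i, b in enumerate(data) if b in RESERVED_BYTES]
```

An empty list means none of the four byte values occurs in the input.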

For details about the KANJI1 server character set, see SQL Data Types and Literals.