KANJI1 Character Set - Teradata Database

International Character Set Support

Product: Teradata Database
Release Number: 15.10
Language: English (United States)
Last Update: 2018-09-25
dita:id: B035-1125
lifecycle: previous
Product Category: Teradata® Database

The KANJI1 server character set is designed for Japanese applications that must remain compatible with Teradata Kanji data from Teradata Database releases prior to V2R3.0.

The semantics and limitations of KANJI1 are identical to those of the CHARACTER server character set in releases prior to V2R3.0.

Notice:

In accordance with Teradata internationalization plans, KANJI1 support is deprecated and is to be discontinued in the near future. KANJI1 is not allowed as a default character set; the system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible.

See “KANJI1 Restrictions.”

The JIS X 0201 mapping standard is the basis for the KANJI1 server character set.

Because it stores client character data in the form-of-use in which it receives it, KANJI1 is non-canonical and cannot be shared among heterogeneous clients.
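
The effect of non-canonical storage can be illustrated outside the database. The following is a minimal sketch, not Teradata code: it assumes Python's shift_jis and euc_jp codecs as rough stand-ins for the KanjiSJIS_0S and KanjiEUC_0U client character sets, and the helper read_as is hypothetical.

```python
# Illustration only: Python codecs approximate two Japanese client encodings.
text = "日本語"

sjis_bytes = text.encode("shift_jis")  # roughly what a Shift-JIS client would send
euc_bytes = text.encode("euc_jp")      # roughly what an EUC client would send

def read_as(raw, codec):
    """Decode stored bytes the way a client using `codec` would interpret them."""
    return raw.decode(codec, errors="replace")

# The same characters arrive as different byte sequences.
print(sjis_bytes == euc_bytes)                       # False

# Bytes stored as received read back correctly only in the original encoding.
print(read_as(sjis_bytes, "shift_jis") == text)      # True
print(read_as(sjis_bytes, "euc_jp") == text)         # False
```

Because the stored bytes carry no record of the encoding that produced them, a client using a different encoding has no reliable way to interpret them.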

Note: Upon upgrading to Teradata Database 14.0 or greater, the system automatically replaces DEFAULT CHARACTER SET KANJI1 with DEFAULT CHARACTER SET UNICODE in existing user definitions.

As part of the plans for discontinuing KANJI1 support, the creation of new KANJI1 objects is highly restricted. Including the phrase CHARACTER SET KANJI1 in any of the following statements returns a syntax error:

  • CREATE USER/MODIFY USER
  • CREATE TABLE/ALTER TABLE
  • CREATE FUNCTION/REPLACE FUNCTION
  • CREATE TYPE/ALTER TYPE
  • CREATE PROCEDURE/REPLACE PROCEDURE
  • CREATE MACRO/REPLACE MACRO
  • CREATE VIEW/REPLACE VIEW
  • CAST function

Plan to use the TRANSLATE function to convert existing KANJI1 data to Unicode or another supported server character set. For details, see “TRANSLATE” in SQL Functions, Operators, Expressions, and Predicates.
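
When preparing for conversion, it can help to find the statements in existing scripts that contain the restricted phrase. The following is a minimal sketch under stated assumptions: the helper statements_using_kanji1, the sample table and user names, and the naive split on semicolons are all hypothetical, and the pattern matches only the phrase CHARACTER SET KANJI1 named above.

```python
import re

# Matches the phrase named in the text, tolerating case and extra whitespace.
KANJI1_CLAUSE = re.compile(r"\bCHARACTER\s+SET\s+KANJI1\b", re.IGNORECASE)

def statements_using_kanji1(script):
    """Return the statements in `script` that contain the KANJI1 clause.

    Splitting on semicolons is a simplification: it does not account for
    semicolons inside string literals or procedure bodies.
    """
    statements = [s.strip() for s in script.split(";")]
    return [s for s in statements if s and KANJI1_CLAUSE.search(s)]

# Hypothetical sample script.
ddl = """
CREATE TABLE orders (id INTEGER, note VARCHAR(100) CHARACTER SET KANJI1);
CREATE TABLE items (id INTEGER, name VARCHAR(100) CHARACTER SET UNICODE);
MODIFY USER app_user AS DEFAULT character  set kanji1;
"""

for statement in statements_using_kanji1(ddl):
    print(statement)
```

The sketch only locates candidate statements; deciding on the replacement character set and converting the data remain manual steps.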

The KANJI1 server character set is designed to support Japanese characters only when using specifically designed client character sets.

KANJI1 supports single-byte characters from the following client character sets:

  • EBCDIC
  • ASCII
  • UTF8

Only single-byte characters that are coded the same in the LATIN and JIS X 0201 standards can be used.

KANJI1 supports mixed single- and multibyte characters from the following client character sets:

  • KanjiEBCDIC
      ◦ Single-byte data uses JIS X 0201
      ◦ Double-byte data uses SO/SI
  • KanjiSJIS_0S
      ◦ JIS X 0201
      ◦ JIS X 0208
  • Kanji1932_1S0
      ◦ JIS X 0201
      ◦ JIS X 0208
  • KanjiEUC_0U
      ◦ JIS X 0201
          ▪ Bytes 00-7F as defined
          ▪ Byte 8E mapped to 80
      ◦ JIS X 0208 (converted to KanjiSJIS)
      ◦ JIS X 0212
          ▪ Byte 8F mapped to FF

Only a limited set of characters stored as KANJI1 can be retrieved correctly by a client character set other than the one that entered the data. This limited set includes most 7-bit ASCII characters:

  • The letters A-Z and a-z
  • The digits 0-9
  • Various punctuation, symbols, and control characters

In general, it does not include:

  • Japanese characters
  • The backslash (\)
  • Yen sign
  • Tilde
  • Overline

Attempts to retrieve Japanese or other non-sharable characters from a KANJI1 field may result in error messages for the following character sets:

  • UTF8
  • UTF16
  • Most site-defined client character sets using map files

For other character sets, attempts to retrieve Japanese or other non-sharable characters from a KANJI1 field may result in improperly translated data.

KANJI1 character data is usually a mixture of single- and multibyte characters. Therefore, even when a session uses a non-Japanese client character set, such as standard EBCDIC or ASCII, certain character configurations are interpreted either as starting a multibyte character string or as control characters.

The following interpretations apply to name and data characters for all client character sets when the server character set is KANJI1.

Characters with a client encoding of …   Are …
--------------------------------------   -----
0x0E, 0x0F                               interpreted as the Shift-Out or Shift-In character, respectively, which delimit the start or end of a multibyte character string.
0x80, 0xFF                               translated internally into characters that are reserved for the ss2 and ss3 escape characters of KanjiEUC data.
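
Client data can be checked for these byte values before it is loaded. The following is a minimal sketch, not a Teradata utility: the helper find_special_bytes is hypothetical, the byte values come from the table above, and the pairing of 0x80 with ss2 and 0xFF with ss3 follows the KanjiEUC_0U mappings listed earlier (byte 8E mapped to 80, byte 8F mapped to FF).

```python
# Byte values taken from the table above; the descriptions paraphrase it.
SPECIAL_BYTES = {
    0x0E: "Shift-Out: starts a multibyte character string",
    0x0F: "Shift-In: ends a multibyte character string",
    0x80: "reserved internally for the KanjiEUC ss2 escape",
    0xFF: "reserved internally for the KanjiEUC ss3 escape",
}

def find_special_bytes(raw):
    """Return (offset, byte value, meaning) for each special byte in `raw`."""
    return [(i, b, SPECIAL_BYTES[b]) for i, b in enumerate(raw) if b in SPECIAL_BYTES]

# Shift-Out and Shift-In bracketing two bytes inside otherwise plain data.
for offset, value, meaning in find_special_bytes(b"AB\x0eEo\x0fCD"):
    print(offset, hex(value), meaning)

# A non-Japanese client can also produce these values: in Latin-1, the
# character U+00FF is encoded as the single byte 0xFF.
print(find_special_bytes("na\u00efve \u00ff".encode("latin-1")))
```

The second call shows why the interpretations apply to every client character set: a single-byte encoding with no Japanese characters can still contain a byte that KANJI1 treats specially.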

For details about the KANJI1 server character set, see SQL Data Types and Literals.