15.00 - Effects of Server Character Sets on Character String Functions - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product: Teradata Database
Release Number: 15.00
Content Type: Programming Reference
Publication ID: B035-1145-015K
Language: English (United States)
Last Update: 2018-09-24

Effects of Server Character Sets on Character String Functions

String functions that operate on character data follow the rules listed below.

Uppercase Character Conversion for LATIN

For the LATIN server character set, the method of converting to uppercase characters is based on ISO 8859 Latin1.
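As a rough illustration of what uppercase conversion confined to ISO 8859-1 can look like, the sketch below shifts only those lowercase letters whose uppercase form also exists in Latin1. The function name `latin1_upper` is hypothetical, and leaving ß, ÿ, and µ unchanged is an assumption made for this sketch; the source states only that the method is based on ISO 8859 Latin1.

```python
def latin1_upper(text: str) -> str:
    """Uppercase text using only case pairs that exist within ISO 8859-1."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x61 <= code <= 0x7A:
            # ASCII a-z map to A-Z
            out.append(chr(code - 0x20))
        elif 0xE0 <= code <= 0xFE and code != 0xF7:
            # Latin1 lowercase letters 0xE0-0xFE map to 0xC0-0xDE;
            # 0xF7 is the division sign, not a letter
            out.append(chr(code - 0x20))
        else:
            # Assumption: characters with no uppercase form inside Latin1
            # (for example 0xDF and 0xFF) are left unchanged
            out.append(ch)
    return "".join(out)


result = latin1_upper("café ÿ ß ÷")
print(result)
```

Every character the sketch returns is still encodable as Latin1, which is the property a LATIN column would need.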

Logical Characters vs. Physical Characters

For UNICODE, GRAPHIC and KANJISJIS server character sets, the functions operate on a logical character basis, except for the functions that are sensitive to the ANSI mode vs. Teradata mode switch.

Although the storage space for KANJISJIS is allocated on a physical basis and is not ANSI compatible, all string operations on this type operate on a character basis as dictated by ANSI.

Untranslatable KANJI1 Characters

Caution:

In accordance with Teradata internationalization plans, KANJI1 support is deprecated and is to be discontinued in the near future. KANJI1 is not allowed as a default character set; the system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible.

Character string functions do not work on all characters in the KANJI1 server character set when the session character set is UTF8 or UTF16, because the KANJI1 server character set is ambiguous in regard to multibyte characters and some single-byte characters.

Recommendation: Unless the KANJI1 server character set is required, use the UNICODE server character set with the UTF8 and UTF16 session character sets for best results.

In KanjiEBCDIC to KANJI1 translations, the following single-byte characters are mapped to the Unicode characters named below.

Hexadecimal Value    Character    Unicode Character Name
0x10                 ¢            CENT SIGN
0x11                 £            POUND SIGN
0x12                 ¬            NOT SIGN
0x13                 \            REVERSE SOLIDUS
0x14                 ~            TILDE

However, with a KanjiSJIS character set, these hexadecimal values map to control characters.
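The ambiguity of these five values can be checked with a small sketch. The dictionary name `KANJIEBCDIC_SINGLE_BYTE` is hypothetical and simply restates the mapping listed above; Python's `shift_jis` codec is assumed here as an approximation of a KanjiSJIS interpretation of the same byte values.

```python
import unicodedata

# Mapping restated from the table above (KanjiEBCDIC to KANJI1 translation)
KANJIEBCDIC_SINGLE_BYTE = {
    0x10: "¢",
    0x11: "£",
    0x12: "¬",
    0x13: "\\",
    0x14: "~",
}

# Unicode names of the characters the values stand for under KanjiEBCDIC
names = [unicodedata.name(ch) for ch in KANJIEBCDIC_SINGLE_BYTE.values()]

# General category of the same byte values when decoded as Shift-JIS;
# "Cc" is the Unicode category for control characters
sjis_categories = [
    unicodedata.category(bytes([value]).decode("shift_jis"))
    for value in KANJIEBCDIC_SINGLE_BYTE
]

for value, name, category in zip(KANJIEBCDIC_SINGLE_BYTE, names, sjis_categories):
    print(f"0x{value:02X}  {name:16}  Shift-JIS category: {category}")
```

The same stored value therefore names a printable symbol under one interpretation and a control character under the other, which is why string functions cannot treat such KANJI1 data reliably.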

Implicit Server Character Set Translation

For functions that operate on more than one argument, if the arguments have different server character sets, implicit translation rules take effect.

For details, see “Implicit Character-to-Character Translation.”