About Extended Character Sets - Teradata Database

International Character Set Support

Product

Teradata Database

Release Number

15.10

Language

English (United States)

Last Update

2018-09-25

dita:id

B035-1125

Product Category

Teradata® Database

An extended site-defined character set defines the mapping of hexadecimal values to characters for the single-byte and multibyte components of a character set. If you use a non-Western European language such as Russian, Arabic, or Urdu, you can define and install your own single-byte character set. If you use a language such as Japanese, Korean, or Chinese, and the Teradata-supplied character sets are not entirely sufficient for your site, you can define and install your own multibyte character set.

Extended site-defined character sets can support, with certain constraints, any subset of the Unicode repertoire.

A user who is sufficiently privileged can define the relevant client character set, mapping bytes from the client to their corresponding Unicode values.
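As a rough illustration of what such a mapping expresses, the following Python sketch decodes client bytes into Unicode characters using a small table with single-byte and two-byte entries. It is not the Teradata definition file format. The names `MAPPING`, `is_lead_byte`, and `decode_client_bytes` are hypothetical, and the byte values and lead-byte ranges are borrowed from the Shift-JIS family (Windows code page 932) purely as an example.

```python
# Hypothetical excerpt of a client character set mapping:
# client byte sequence -> Unicode code point.
# Byte values follow Windows code page 932 for illustration only.
MAPPING = {
    b"\x41": 0x0041,      # single-byte: LATIN CAPITAL LETTER A
    b"\xb1": 0xFF71,      # single-byte: HALFWIDTH KATAKANA LETTER A
    b"\x82\xa0": 0x3042,  # two-byte: HIRAGANA LETTER A
    b"\x82\xa2": 0x3044,  # two-byte: HIRAGANA LETTER I
}


def is_lead_byte(value: int) -> bool:
    """Return True if the byte starts a two-byte character (Shift-JIS style ranges)."""
    return 0x81 <= value <= 0x9F or 0xE0 <= value <= 0xFC


def decode_client_bytes(data: bytes) -> str:
    """Translate client bytes to a Unicode string using MAPPING."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if is_lead_byte(data[i]) else 1
        chunk = data[i:i + width]
        if chunk not in MAPPING:
            raise ValueError(f"no mapping for bytes {chunk.hex()}")
        chars.append(chr(MAPPING[chunk]))
        i += width
    return "".join(chars)


decoded = decode_client_bytes(b"A\x82\xa0\xb1")
print([hex(ord(c)) for c in decoded])
```

A byte sequence that is absent from the table raises an error in this sketch; how a real installation treats unmapped bytes is not covered here.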

Character data entered using extended site-defined client character sets should be stored in columns defined as Unicode. The UNICODE server character set requires two bytes of storage per character, so a CHAR(5) CHARACTER SET UNICODE column occupies 10 bytes of storage.

Given the 64000-byte limit on column size and two bytes of storage per character, a UNICODE column cannot exceed 32000 characters. Furthermore, the combination of character data and other data types cannot exceed the 64000-byte limit on row size.
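The storage arithmetic can be checked with a short Python sketch. The constants come from the figures stated above; the function names are hypothetical, and the row check deliberately ignores any per-row overhead, which is an assumption made to keep the example simple.

```python
from typing import Iterable

BYTES_PER_UNICODE_CHAR = 2  # UNICODE server character set, as stated above
MAX_COLUMN_BYTES = 64000    # column size limit, as stated above
MAX_ROW_BYTES = 64000       # row size limit, as stated above


def unicode_column_bytes(n_chars: int) -> int:
    """Bytes needed to store n_chars characters in a UNICODE column."""
    return n_chars * BYTES_PER_UNICODE_CHAR


def max_unicode_chars(limit_bytes: int = MAX_COLUMN_BYTES) -> int:
    """Largest character count that fits within limit_bytes."""
    return limit_bytes // BYTES_PER_UNICODE_CHAR


def fits_in_row(column_bytes: Iterable[int]) -> bool:
    """Check combined column widths against the row limit (overhead ignored)."""
    return sum(column_bytes) <= MAX_ROW_BYTES


print(unicode_column_bytes(5))  # CHAR(5) CHARACTER SET UNICODE -> 10
print(max_unicode_chars())      # -> 32000
# A maximum-width UNICODE column plus a 4-byte column exceeds the row limit.
print(fits_in_row([unicode_column_bytes(32000), 4]))  # -> False
```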

A user who is sufficiently privileged can also define an appropriate collation.

If a custom collation is required, and the CHARSET_COLL collation does not produce the desired result, then you can modify the MULTINATIONAL collation. For information, see “MULTINATIONAL Collation for Extended Site-Defined Character Sets.”
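To see why a binary ordering may not produce the desired result, consider the Python sketch below. It assumes that CHARSET_COLL orders data by the binary values of the client character set encoding; that is an assumption here, since the text above does not define it. Python's `cp1251` codec stands in for a Cyrillic client character set, and the weight table `WEIGHT` is a hypothetical stand-in for a custom collation, not the MULTINATIONAL collation format.

```python
# Illustrative only: three ways to order the same Cyrillic words.
WORDS = ["жук", "ёж", "еда"]

# 1. Binary order of the client encoding (cp1251 used as a stand-in).
#    In cp1251, "ё" is 0xB8, which sorts before every other lowercase letter.
by_client_bytes = sorted(WORDS, key=lambda w: w.encode("cp1251"))

# 2. Binary order of Unicode code points.
#    "ё" is U+0451, which sorts after "я" (U+044F).
by_code_point = sorted(WORDS)

# 3. Hypothetical custom weights that place "ё" directly after "е",
#    as in the Russian alphabet.
ALPHABET = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
WEIGHT = {ch: i for i, ch in enumerate(ALPHABET)}
by_custom_weights = sorted(WORDS, key=lambda w: [WEIGHT[c] for c in w])

print(by_client_bytes)
print(by_code_point)
print(by_custom_weights)
```

Only the third ordering follows alphabetical order for these words, which is the kind of result a custom collation is meant to provide.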

Only sufficiently privileged users can define extended site-defined character sets and collations. A sufficiently privileged user is one who can:

Edit and place files in the appropriate directories on every node in the Teradata Database.

Modify records in DBC.CharTranslationsV and DBC.CollationsV.

Restart the Teradata Database.

Teradata ships the following character sets that you can use as examples of extended site-defined character sets:

SCHEBCDIC935_2IJ

TCHEBCDIC937_3IB

HANGULEBCDIC933_1II

SCHGB2312_1T0

TCHBIG5_1R0

HANGULKSC5601_2R4

KANJI932_1S0

LATIN1252_3A0

LATIN1250_1A0

LATIN1254_7A0

LATIN1258_8A0

SCHINESE936_6R0

HANGUL949_7R0

HEBREW1255_5A0

ARABIC1256_6A0

CYRILLIC1251_2A0

THAI1874_4A0

TCHINESE950_8R0