17.20 - Definitions - Analytics Database - Teradata Vantage

Teradata Vantage™ - Analytics Database International Character Set Support - 17.20

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database, Teradata Vantage
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2023-01-27

EUC (Extended UNIX Code) is composed of one primary and three supplementary code sets.

The primary code set, code set 0, is used for ASCII characters.

The user can assign the three supplementary code sets (code sets 1, 2, and 3) to different character sets.

A system default assignment exists for these code sets.

The primary code set is defined as single-byte, with the most significant (high-order) bit set to 0. Characters in the supplementary code sets can span multiple bytes, and the most significant bit of each byte is set to 1.

Characters from code sets 2 and 3 are preceded by a single-shift character, known as ss 2 and ss 3, respectively, where ss 2 is 0x8E and ss 3 is 0x8F. The code sets are differentiated as follows.

| IF the most significant bit is … | THEN … |
| --- | --- |
| 0 | the code set is one-byte ASCII. |
| 1 | the byte is checked for ss 2 or ss 3 to determine the code set; a byte that is neither belongs to code set 1. The length in bytes of characters from that code set is retrieved from an ANSI localization table governing character classification, and that number of bytes is read in. |
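
The differentiation rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not how the database implements them. The function name `split_euc` and the table `EUC_JP_LENGTHS` are hypothetical. The character lengths are an assumption that follows the common EUC-JP layout and stand in for the localization table, which is not reproduced here.

```python
SS2 = 0x8E  # single-shift 2: introduces a character from code set 2
SS3 = 0x8F  # single-shift 3: introduces a character from code set 3

# Hypothetical stand-in for the localization table: total bytes per character,
# counting the single-shift byte. These values assume the common EUC-JP layout.
EUC_JP_LENGTHS = {1: 2, 2: 2, 3: 3}


def split_euc(data: bytes, lengths=EUC_JP_LENGTHS):
    """Split an EUC byte string into (code_set, raw_bytes) pairs."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead & 0x80 == 0:
            # Most significant bit is 0: one-byte ASCII (code set 0).
            code_set, size = 0, 1
        elif lead == SS2:
            code_set, size = 2, lengths[2]
        elif lead == SS3:
            code_set, size = 3, lengths[3]
        else:
            # High bit set and no single shift: code set 1.
            code_set, size = 1, lengths[1]
        if i + size > len(data):
            raise ValueError(f"truncated character at offset {i}")
        chars.append((code_set, data[i:i + size]))
        i += size
    return chars


if __name__ == "__main__":
    # One ASCII byte, one code set 1 character, one code set 2 character.
    for code_set, raw in split_euc(b"A\xa4\xa2\x8e\xb1"):
        print(code_set, raw.hex())
```

Passing a different `lengths` mapping models a different assignment of character sets to the supplementary code sets. The sketch only classifies and slices bytes. It does not validate that trailing bytes fall in a legal range.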