16.20 - Unicode Pass Through Overview - Teradata Vantage NewSQL Engine

Teradata Vantage™ NewSQL Engine International Character Set Support

prodname: Teradata Database, Teradata Vantage NewSQL Engine
vrm_release: 16.20
created_date: March 2019
category: Configuration, User Guide
featnum: B035-1125-162K

Unicode Pass Through (UPT) is a Unicode error handling feature that lets users import Pass Through Characters (PTCs) into Teradata and export them from it. This increases the repertoire of characters available to Unicode users and makes it possible to store, retrieve, and analyze emoji and ideographs that Teradata does not otherwise support.

PTCs consist of the following:
  • Characters not supported by Teradata:
    • Characters in the BMP (Basic Multilingual Plane, plane 0) added in Unicode versions 6.1.0 to 9.0.0
    • Characters in the supplementary planes (planes 1 to 16) from all Unicode versions
    These are the characters that Teradata currently does not support. The set may change if Teradata adds support for additional characters.
  • Unassigned characters
  • Private use characters

Unassigned characters are code points that the Unicode Standard has not assigned to any character and that no implementation supports. Private use characters will never be assigned by the Unicode Standard; their meaning is determined by private agreement among cooperating implementations.
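As a rough illustration of these two categories, the sketch below uses Python's standard `unicodedata` module, which reports the general category `Co` for private use code points and `Cn` for code points that have no assigned character. This is not Teradata's own classification logic: the helper name `describe_code_point` is hypothetical, and the answer for unassigned code points depends on the Unicode version bundled with the Python interpreter, which may be newer than the versions discussed here.

```python
import unicodedata


def describe_code_point(cp: int) -> str:
    """Return a coarse label for a code point based on its general category.

    Hypothetical helper for illustration only; it does not reproduce
    Teradata's definition of a Pass Through Character.
    """
    category = unicodedata.category(chr(cp))
    if category == "Co":
        return "private use"
    if category == "Cn":
        # Cn covers both unassigned code points and noncharacters.
        return "unassigned or noncharacter"
    if category == "Cs":
        return "surrogate"
    return "assigned"


for cp in (0x0041, 0xE000, 0x35555, 0xFFFF0):
    print(f"U+{cp:04X}: {describe_code_point(cp)}")
```

Because `Cn` does not distinguish unassigned code points from noncharacters, a separate noncharacter test is needed to tell the two apart.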

The following are not PTCs:
  • Unicode 6.0 BMP characters supported by Teradata
  • The set of noncharacter code points

Noncharacters are code points that are permanently reserved for internal use.
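The Unicode Standard defines exactly 66 noncharacters: U+FDD0 through U+FDEF, plus the last two code points of each of the 17 planes (those ending in FFFE or FFFF). A minimal Python sketch of that rule follows. The function name `is_noncharacter` is illustrative and is not a Teradata API.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    # Contiguous range of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # Last two code points of every plane: low 16 bits are FFFE or FFFF.
    return (cp & 0xFFFE) == 0xFFFE


for cp in (0xFFFD, 0xFFFE, 0x5FFFE, 0x6FFFF, 0x10FFFD):
    print(f"U+{cp:04X}: {is_noncharacter(cp)}")
```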

Do not use UPT if you rely on Teradata to screen out characters that Teradata does not support or the REPLACEMENT CHARACTER (U+FFFD).
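Applications that depended on that screening could check data on the client side before loading it. The sketch below is one hypothetical way to do so in Python, not a Teradata utility: it flags U+FFFD and any code point outside the BMP. It deliberately does not detect BMP characters added after Unicode 6.0, unassigned code points, or private use characters, since those would need version-specific tables; the function name `find_suspect_characters` is an assumption made for this example.

```python
REPLACEMENT_CHARACTER = "\uFFFD"


def find_suspect_characters(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') pairs for U+FFFD and non-BMP characters.

    Illustrative only: this covers a subset of what the database might
    otherwise have screened.
    """
    hits = []
    for index, ch in enumerate(text):
        if ch == REPLACEMENT_CHARACTER or ord(ch) > 0xFFFF:
            hits.append((index, f"U+{ord(ch):04X}"))
    return hits


print(find_suspect_characters("ok \U0001F600 bad\uFFFD"))
```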

Examples of PTCs and Noncharacters

| Unicode Code Point | Plane | UTF-16 (hex) | Unicode Name, <noncharacter>, or <unassigned> | Notes |
|---|---|---|---|---|
| U+20BD | 0 | 20BD | RUBLE SIGN | Basic Multilingual Plane (BMP) |
| U+1F600 | 1 | D83D DE00 | GRINNING FACE | Supplementary Multilingual Plane (SMP) |
| U+20000 | 2 | D840 DC00 | CJK UNIFIED IDEOGRAPH-20000 | First character of CJK Unified Ideographs Extension B. Supplementary Ideographic Plane (SIP) |
| U+35555 | 3 | D895 DD55 | <unassigned> | Tertiary Ideographic Plane (TIP). No assigned characters. |
| U+46666 | 4 | D8D9 DE66 | <unassigned> | No characters defined on plane 4 |
| U+5FFFE | 5 | D93F DFFE | <noncharacter> | No characters defined on plane 5 |
| U+6FFFF | 6 | D97F DFFF | <noncharacter> | No characters defined on plane 6 |
| U+E0103 | 14 | DB40 DD03 | VARIATION SELECTOR-20 | Supplementary Special-purpose Plane (SSP) |
| U+FFFF0 | 15 | DBBF DFF0 | <unassigned> | Supplementary Private Use Area-A |
| U+10FFFD | 16 | DBFF DFFD | <unassigned> | Supplementary Private Use Area-B |
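The Plane and UTF-16 columns follow directly from the code point value: the plane is the code point divided by 0x10000, and code points above U+FFFF are encoded in UTF-16 as a surrogate pair. The Python sketch below reproduces both calculations using the standard UTF-16 algorithm; the function names `plane_of` and `utf16_hex` are illustrative and not part of any Teradata interface.

```python
def plane_of(cp: int) -> int:
    """Return the Unicode plane (0 to 16) that contains cp."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return cp >> 16


def utf16_hex(cp: int) -> str:
    """Return the UTF-16 code units for cp as uppercase hex."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        # BMP code points are a single 16-bit code unit.
        return f"{cp:04X}"
    offset = cp - 0x10000                # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return f"{high:04X} {low:04X}"


for cp in (0x20BD, 0x1F600, 0xE0103, 0x10FFFD):
    print(f"U+{cp:04X}  plane {plane_of(cp)}  UTF-16 {utf16_hex(cp)}")
```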