Unicode Pass Through is a Unicode error handling enhancement. The feature lets users import Pass Through Characters (PTCs) into Teradata and export them from Teradata. PTCs are all possible Unicode characters other than the Unicode 6.0 BMP characters that Teradata supports and the noncharacters. PTCs include the following:
- Characters not supported by Teradata:
  - Characters in BMP from Unicode versions 6.1.0 to 9.0.0
  - Characters in SMP from all Unicode versions
- Unassigned characters
- Private use characters
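The categories above can be approximated on the client side from code point ranges alone. The following is a minimal sketch, not Teradata's implementation: the function names `is_noncharacter` and `classify` are hypothetical, and the sketch assumes that "SMP" in the list refers to every code point above U+FFFF. It cannot tell whether a given BMP character belongs to the Unicode 6.0 repertoire that Teradata supports, so ordinary BMP code points are reported only as "bmp".

```python
def is_noncharacter(cp: int) -> bool:
    """True for the 66 Unicode noncharacters:
    U+FDD0..U+FDEF plus the last two code points of each of the 17 planes."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify(cp: int) -> str:
    """Rough, range-based classification of a code point (illustrative only)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"          # not a character; only valid as part of UTF-16
    if is_noncharacter(cp):
        return "noncharacter"       # excluded from the PTC set
    if cp >= 0x10000:
        return "supplementary"      # assumption: what the list above calls SMP
    if 0xE000 <= cp <= 0xF8FF:
        return "bmp-private-use"    # BMP Private Use Area
    return "bmp"                    # support depends on the Unicode 6.0 repertoire


if __name__ == "__main__":
    for ch in ("A", "\U0001F600", "\uE000", "\uFFFE"):
        print(f"U+{ord(ch):04X}", classify(ord(ch)))
```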
The feature is enabled or disabled at the session level and is valid only for Unicode sessions. UTF8 or UTF16 sessions with the Unicode Pass Through feature enabled allow PTCs to be used in queries and to be stored in or retrieved from Unicode columns. Noncharacters and characters with an invalid encoding form are changed to the REPLACEMENT CHARACTER (U+FFFD), which can pass through and be stored in Teradata with this feature.
By default, this feature is not enabled. In sessions where the feature is not enabled, an attempt to input any Unicode code point outside the supported 6.0 BMP repertoire into Teradata results in an error and the query fails.
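The substitution of U+FFFD for noncharacters and invalid encodings can be previewed on the client before loading. The sketch below only imitates the described behavior with Python's standard UTF-8 decoder. The helper name `preview_pass_through` is hypothetical, and the number of U+FFFD characters that Teradata emits for a given invalid byte sequence may differ from what Python produces.

```python
REPLACEMENT = "\uFFFD"  # REPLACEMENT CHARACTER


def _is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def preview_pass_through(raw: bytes) -> str:
    """Decode UTF-8 bytes, mapping invalid sequences and noncharacters to U+FFFD.

    Illustrative only: an approximation of the documented behavior,
    not the database's own conversion routine.
    """
    # errors="replace" turns each undecodable sequence into U+FFFD.
    text = raw.decode("utf-8", errors="replace")
    # Noncharacters decode successfully in Python, so replace them explicitly.
    return "".join(REPLACEMENT if _is_noncharacter(ord(ch)) else ch for ch in text)


if __name__ == "__main__":
    sample = "ok \U0001F600 ".encode("utf-8") + b"\xff" + "\uFFFE".encode("utf-8")
    print(ascii(preview_pass_through(sample)))
```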
Benefits
- Increases the repertoire of characters available to Unicode users from 61K to 1,112K, including Emoji and other ideographs that previously could not be stored in Teradata.
- Reduces or eliminates cases where translation errors require users to cleanse their Unicode data before submitting it to Teradata.
- Simplifies client load filtering, which improves load performance.
- Avoids the data loss that occurs when unsupported characters are substituted with replacement characters.
- Enhances Teradata Unicode support to be compatible with Unicode provided by other data sources including other databases.
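The 1,112K figure in the first benefit is consistent with the size of the Unicode code space once surrogates are excluded. The arithmetic below is one plausible reading of that figure, not a statement of how it was derived. The constant names are illustrative.

```python
# Unicode code space: 17 planes of 65,536 code points each.
TOTAL_CODE_POINTS = 17 * 0x10000             # 1,114,112

# Surrogate code points U+D800..U+DFFF are not characters.
SURROGATES = 0xDFFF - 0xD800 + 1             # 2,048

# Noncharacters: U+FDD0..U+FDEF (32) plus two at the end of each plane.
NONCHARACTERS = 32 + 2 * 17                  # 66

scalar_values = TOTAL_CODE_POINTS - SURROGATES     # 1,112,064
excluding_nonchars = scalar_values - NONCHARACTERS # 1,111,998

if __name__ == "__main__":
    print(f"scalar values:            {scalar_values:,}")
    print(f"excluding noncharacters:  {excluding_nonchars:,}")
```

Either total rounds to roughly 1,112K.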
Considerations
- This feature applies only to Unicode data and sessions, meaning UTF8 and UTF16 sessions and the UNICODE server character set. For example, importing the 0x1A substitute (SUB) character from an ASCII character set to a Unicode column is still rejected, even in a session with the Unicode Pass Through feature enabled.
- Although this feature allows the storage and retrieval of PTCs, full support is not available for these characters in Teradata. For example, collation, case sensitivity, and object name support are not included with this feature.
- PTCs should not be stored in a column that is part of an index.
- This feature is not supported with standalone utilities such as FastLoad and MultiLoad. Use Teradata Parallel Transporter to load or unload data containing PTCs.
- Once PTCs are stored in Teradata, they may be difficult to remove: they must be explicitly deleted or replaced.
- With this feature enabled, Teradata no longer screens out unsupported characters or the REPLACEMENT CHARACTER (U+FFFD), so users who depend on that screening must perform it themselves.
- This feature allows Teradata to support characters that were not previously supported. Therefore, data on a Teradata system with this feature enabled cannot be transported to a Teradata system that is running on an earlier release.
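Because Teradata no longer screens these characters when the feature is on, a client-side check can be useful before loading data, or before moving data to an earlier release. The following is a minimal sketch under stated assumptions: the helper `find_suspect_chars` is hypothetical, and it flags only supplementary-plane characters, BMP private use characters, and U+FFFD. It does not detect BMP characters added after Unicode 6.0, which would require a table of the repertoire that Teradata supports.

```python
from typing import List, Tuple


def find_suspect_chars(text: str) -> List[Tuple[int, str, str]]:
    """Return (index, code point label, reason) for characters worth reviewing.

    Illustrative screening only; it is not a complete test for PTCs.
    """
    findings = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if cp >= 0x10000:
            reason = "supplementary"
        elif 0xE000 <= cp <= 0xF8FF:
            reason = "private use"
        elif cp == 0xFFFD:
            reason = "replacement character"
        else:
            continue  # ordinary BMP character; not flagged by this sketch
        findings.append((index, f"U+{cp:04X}", reason))
    return findings


if __name__ == "__main__":
    for finding in find_suspect_chars("a\U0001F600b\uFFFD"):
        print(finding)
```

A load script might reject or quarantine rows for which this list is non-empty, depending on whether the target columns are indexed or shared with older systems.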
SQL Changes
SET SESSION CHARACTER SET UNICODE PASS THROUGH [ON|OFF]
Additional Information
For more information on Unicode Pass Through, see:
- Teradata® Database International Character Set Support
- Teradata® Database SQL Data Definition Language - Syntax and Examples