15.00 - Unicode® Character Sets - FastLoad

Teradata FastLoad Reference

Unicode® Character Sets

UTF‑8 and UTF‑16 are two standard encodings of Unicode character data. The UTF‑8 client character set supports UTF‑8 encoding; Teradata Database currently supports UTF‑8 characters that are one to three bytes long. The UTF‑16 client character set supports UTF‑16 encoding; Teradata Database currently supports the Unicode 5.1 standard, where each defined character requires exactly 16 bits.
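The byte counts described above can be checked with a short Python sketch. It is only an illustration using Python's built-in codecs, not part of Teradata FastLoad; the sample characters and the helper name encoded_sizes are chosen here for the example. All samples come from the Basic Multilingual Plane (U+0000 to U+FFFF), the range in which a UTF‑8 character takes one to three bytes and a UTF‑16 character takes a single 16‑bit unit.

```python
# Sample characters, all from the Basic Multilingual Plane (U+0000..U+FFFF).
SAMPLES = ["A", "\u00e9", "\u20ac", "\u65e5"]


def encoded_sizes(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-le" is used so that no 2-byte byte-order mark is prepended,
    # which plain "utf-16" would add to the count.
    utf16_len = len(ch.encode("utf-16-le"))
    return utf8_len, utf16_len


for ch in SAMPLES:
    u8, u16 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {u8} byte(s), UTF-16 = {u16} byte(s)")
```

Each sample occupies two bytes (16 bits) in UTF‑16, while its UTF‑8 length varies from one to three bytes depending on the code point.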

For restrictions that Teradata Database imposes on the use of the UTF‑8 or UTF‑16 character set, see International Character Set Support (B035‑1132).

UTF8 Character Sets

Teradata FastLoad supports the UTF‑8 character set on network‑attached platforms and IBM z/OS.

On IBM z/OS, the job script must be in Teradata EBCDIC when the UTF‑8 client character set is used. Teradata FastLoad translates the commands in the job script from Teradata EBCDIC to UTF‑8 during the load. Different versions of EBCDIC do not always agree on the placement of special characters, so check the mappings between Teradata EBCDIC and Unicode in International Character Set Support (B035‑1132) to determine the code points of any special characters that the job script requires.
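The following Python sketch illustrates why the code points of special characters matter. It is an assumption‑laden illustration, not Teradata's translation: cp037 and cp500 are two EBCDIC code pages that ship with Python and serve here only as stand‑ins for "different versions of EBCDIC". Neither is Teradata EBCDIC, whose actual mappings are defined in International Character Set Support (B035‑1132). The helper names are hypothetical.

```python
# cp037 and cp500 are EBCDIC code pages bundled with Python.
# ASSUMPTION: they stand in for "different versions of EBCDIC";
# neither one is Teradata EBCDIC.
SPECIALS = "[]!|^;'\"()"


def differing_code_points(chars, codec_a="cp037", codec_b="cp500"):
    """Map each character that the two codecs place at different
    code points to a pair of hex strings (value in a, value in b)."""
    diffs = {}
    for ch in chars:
        a = ch.encode(codec_a)
        b = ch.encode(codec_b)
        if a != b:
            diffs[ch] = (a.hex(), b.hex())
    return diffs


def ebcdic_to_utf8(data, codec):
    """Generic two-step transcode: EBCDIC bytes -> text -> UTF-8 bytes."""
    return data.decode(codec).encode("utf-8")


for ch, (a, b) in differing_code_points(SPECIALS).items():
    print(f"{ch!r}: cp037 = 0x{a}, cp500 = 0x{b}")

# Bytes written under one code page but read under another come out wrong.
raw = "[x]".encode("cp037")
print(ebcdic_to_utf8(raw, "cp037"))  # decoded with the matching code page
print(ebcdic_to_utf8(raw, "cp500"))  # decoded with a mismatched code page
```

Letters and digits sit at the same code points in both code pages, while characters such as the square brackets do not, which is why a script that looks correct in one EBCDIC variant can translate to different Unicode characters under another.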

UTF16 Character Sets

Teradata FastLoad supports the UTF‑16 character set on network‑attached platforms. In general, the command language and the job output should use the same character set as the client character set of the job. However, for the user’s convenience and because of the special properties of Unicode, this is not required with the UTF‑16 client character set: the job script and the job output can each be in either UTF‑8 or UTF‑16. The encodings are selected by specifying the runtime parameters “‑i” and “‑u” when the job is invoked.
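Because a job script may be stored in either UTF‑8 or UTF‑16, it can help to confirm which encoding a script file actually uses before a job is invoked. The Python sketch below is one possible way to do that outside of FastLoad. The byte‑order‑mark heuristic, the helper names, and the sample text are all assumptions made for illustration; they are not FastLoad behavior.

```python
import codecs


def sniff_script_encoding(data):
    """Guess "utf-16" or "utf-8" from a leading byte-order mark.

    ASSUMPTION: a script without a UTF-16 byte-order mark is treated
    as UTF-8. This heuristic is for illustration only.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "utf-8"


def transcode_script(data, target):
    """Re-encode script bytes into the target encoding ("utf-8" or "utf-16")."""
    if sniff_script_encoding(data) == "utf-16":
        text = data.decode("utf-16")     # consumes the byte-order mark
    else:
        text = data.decode("utf-8-sig")  # tolerates an optional UTF-8 BOM
    return text.encode(target)


# Hypothetical script text; any Unicode text behaves the same way.
script_text = "sample script line;\n"
as_utf16 = script_text.encode("utf-16")

print(sniff_script_encoding(as_utf16))                     # utf-16
print(sniff_script_encoding(script_text.encode("utf-8")))  # utf-8
print(transcode_script(as_utf16, "utf-8"))
```

The sketch only identifies or converts the file encoding. The values that “‑i” and “‑u” accept are not covered here and should be taken from the descriptions of the runtime parameters.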