UTF-8 and UTF-16 Character Sets

Teradata MultiLoad Reference

brand: Teradata Tools and Utilities
prodname: MultiLoad
vrm_release: 16.10
category: Programming Reference
featnum: B035-2409-057K

UTF-8 and UTF-16 are two of the standard ways of encoding Unicode character data.

The UTF-8 client character set supports UTF-8 encoding, and the UTF-16 client character set supports UTF-16 encoding.
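As a rough illustration of how the two encodings differ, the following Python sketch counts the bytes a character occupies in each. It uses only Python's built-in codecs and is not part of Teradata MultiLoad; the helper name `encoded_lengths` is hypothetical.

```python
def encoded_lengths(text: str) -> dict:
    """Return the number of bytes `text` occupies in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" is used so that no byte-order mark is counted.
        "utf-16": len(text.encode("utf-16-le")),
    }


# An ASCII letter, an accented Latin letter, and a CJK character.
for sample in ("A", "é", "顧"):
    print(sample, encoded_lengths(sample))
```

The same character can therefore have a different byte length depending on the client character set, which matters when field lengths are given in bytes.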

Teradata Database supports multi-byte characters in object names when the UTF-8 or UTF-16 client character set is used. If multi-byte characters are used in object names in a Teradata MultiLoad script, the names must be enclosed in double quotes.
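A script generator could apply the quoting rule along the following lines. This is a minimal sketch, not a Teradata utility: the function name `quote_object_name` is hypothetical, and it assumes that "multi-byte" can be approximated as "contains a non-ASCII character".

```python
def quote_object_name(name: str) -> str:
    """Wrap an object name in double quotes if it contains non-ASCII characters.

    Assumption: any character outside the ASCII range counts as multi-byte.
    Names that are already enclosed in double quotes are returned unchanged.
    """
    already_quoted = len(name) >= 2 and name.startswith('"') and name.endswith('"')
    if already_quoted:
        return name
    if any(ord(ch) > 127 for ch in name):
        return f'"{name}"'
    return name


print(quote_object_name("customers"))  # plain ASCII name, left as-is
print(quote_object_name("顧客"))        # multi-byte name, gets double quotes
```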

Do not use the TABLE command with the UTF-8 or UTF-16 client character set. Instead, specify the layout of the input record.

Teradata Database imposes restrictions on the use of the UTF-8 and UTF-16 character sets. See International Character Set Support (B035-1125) for details.

UTF-8 Character Sets

Teradata MultiLoad supports the UTF-8 character set on network-attached platforms and IBM z/OS. When using the UTF-8 client character set on IBM z/OS, the job script must be in Teradata EBCDIC. Teradata MultiLoad translates commands in the job script from Teradata EBCDIC to UTF-8 during the load.

Before using the UTF-8 client character set on a mainframe platform, check the character set definition to determine the code points and the mapping between Teradata EBCDIC and Unicode characters. Different versions of EBCDIC do not always agree on the placement of the special characters required in the job script. See International Character Set Support (B035-1125) for details. For more information on using the UTF-8 client character set on mainframe platforms, see:

  • nullexpr and fieldexpr command parameters for the FIELD command in FIELD
  • VARTEXT format delimiter and WHERE condition for the IMPORT command in IMPORT
  • CONTINUEIF condition for the LAYOUT command in LAYOUT
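The caution about EBCDIC variants placing special characters differently can be seen with two EBCDIC code pages that ship with Python. This is only a sketch: `cp037` and `cp500` are stand-ins chosen because Python provides them, not the Teradata EBCDIC definition, and both helper names are hypothetical.

```python
import string


def differing_characters(codec_a: str, codec_b: str,
                         candidates: str = string.punctuation) -> list:
    """List the candidate characters that the two codecs encode differently."""
    return [ch for ch in candidates
            if ch.encode(codec_a) != ch.encode(codec_b)]


def ebcdic_to_utf8(data: bytes, codec: str = "cp037") -> bytes:
    """Translate EBCDIC bytes to UTF-8 bytes (cp037 is an assumed stand-in)."""
    return data.decode(codec).encode("utf-8")


# Punctuation whose code point differs between the two EBCDIC variants.
print(differing_characters("cp037", "cp500"))

# Letters sit at the same code points in both variants, so this list is empty.
print(differing_characters("cp037", "cp500", string.ascii_letters))
```

A comparable check against the character set definition actually installed on the mainframe shows which delimiters and operators in a job script need attention.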

UTF-16 Character Sets

Teradata MultiLoad supports the UTF-16 character set on network-attached platforms. In general, the command language and the job output use the same character set as the client character set used by the job. With a UTF-16 client character set, however, this is not required: the job script and the job output can each be in either UTF-8 or UTF-16, as specified by the run-time parameters "-i" and "-u" when the job is invoked.
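A wrapper that launches jobs might assemble these two parameters as in the sketch below. The helper `encoding_args` is hypothetical, and the value spellings "UTF-8" and "UTF-16" are assumptions; the accepted spellings should be taken from the run-time parameter documentation.

```python
# Assumed value spellings; the documented list of accepted values may differ.
ASSUMED_ENCODINGS = {"UTF-8", "UTF-16"}


def encoding_args(script_encoding: str, output_encoding: str) -> list:
    """Build the -i (script encoding) and -u (output encoding) arguments."""
    for value in (script_encoding, output_encoding):
        if value not in ASSUMED_ENCODINGS:
            raise ValueError(f"unsupported encoding: {value!r}")
    return ["-i", script_encoding, "-u", output_encoding]


# A UTF-8 job script with UTF-16 job output.
print(encoding_args("UTF-8", "UTF-16"))
```

Keeping the two values separate mirrors the fact that the script encoding and the output encoding are chosen independently.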

For more information on the run-time parameters, see the parameters -i scriptencoding and -u outputencoding in Run-time Parameters for Network-Attached Configurations. For more information on using the UTF-16 client character set, see:

  • nullexpr and fieldexpr command parameters for the FIELD command in FIELD
  • WHERE condition for the IMPORT command in IMPORT
  • CONTINUEIF condition for the LAYOUT command in LAYOUT