UTF-8 and UTF-16 Character Sets - MultiLoad

Teradata® MultiLoad Reference

Product
MultiLoad
Release Number
17.00
Published
June 2020
Language
English (United States)
Last Update
2020-06-18
dita:id
B035-2409
lifecycle
previous
Product Category
Teradata Tools and Utilities

UTF-8 and UTF-16 are two of the standard encodings for Unicode character data.

The UTF-8 client character set supports UTF-8 encoding, and the UTF-16 client character set supports UTF-16 encoding.
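As a rough illustration of how the two encodings differ, the following Python sketch encodes the same text both ways and compares the byte counts. It is independent of Teradata MultiLoad; the sample string and the variable names are arbitrary choices made for this example.

```python
# Encode the same Unicode text as UTF-8 and as UTF-16 and compare sizes.
text = "Zürich 東京"  # 9 characters: ASCII, Latin-1, and CJK

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")  # little-endian, no byte order mark

# UTF-8 uses 1 to 4 bytes per character; UTF-16 uses 2 or 4.
print(len(text), len(utf8_bytes), len(utf16_bytes))  # prints: 9 14 18

# Both encodings represent exactly the same characters.
assert utf8_bytes.decode("utf-8") == utf16_bytes.decode("utf-16-le")
```

The character count is the same in both cases; only the number of bytes used to represent each character changes.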

The database supports multi-byte characters in object names when the UTF-8 or UTF-16 client character set is used. If an object name in a Teradata MultiLoad script contains multi-byte characters, the name must be enclosed in double quotation marks.
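The quoting rule above can be sketched as a small Python helper for anyone generating job scripts programmatically. This is a minimal sketch, not part of Teradata MultiLoad: the function names are hypothetical, "multi-byte" is assumed to mean any character that needs more than one byte in UTF-8, and doubling an embedded double quotation mark follows the usual SQL convention rather than anything stated in this reference.

```python
def has_multibyte_chars(name: str) -> bool:
    """Return True if any character needs more than one byte in UTF-8.

    Assumption: this is what "multi-byte" means for object names.
    """
    return any(len(ch.encode("utf-8")) > 1 for ch in name)


def quote_object_name(name: str) -> str:
    """Wrap the name in double quotation marks when it has multi-byte characters."""
    if not has_multibyte_chars(name):
        return name
    # Assumption: an embedded double quotation mark is escaped by doubling it.
    return '"' + name.replace('"', '""') + '"'


print(quote_object_name("employee"))
print(quote_object_name("従業員"))
```

Names made up only of single-byte characters are returned unchanged, so the helper can be applied to every object name in a generated script.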

Do not use the TABLE command with the UTF-8 or UTF-16 client character set. Instead, specify the layout of the input record.

The database imposes restrictions on the use of the UTF-8 and UTF-16 character sets. For details, see Teradata Vantage™ - Advanced SQL Engine International Character Set Support, B035-1125.

UTF-8 Character Sets

Teradata MultiLoad supports the UTF-8 character set on workstation-attached platforms and IBM z/OS. When the UTF-8 client character set is used on IBM z/OS, the job script must be in Teradata EBCDIC. Teradata MultiLoad translates the commands in the job script from Teradata EBCDIC to UTF-8 during the load.

Before using the UTF-8 client character set on a mainframe platform, check the character set definition to determine the code points and the mapping between Teradata EBCDIC and Unicode characters. Different versions of EBCDIC do not always agree on the placement of the special characters required in the job script. See Teradata Vantage™ - Advanced SQL Engine International Character Set Support, B035-1125 for details.

For more information on using the UTF-8 client character set on mainframe platforms, see:

  • nullexpr and fieldexpr command parameters for the FIELD command in FIELD
  • VARTEXT format delimiter and WHERE condition for the IMPORT command in IMPORT
  • CONTINUEIF condition for the LAYOUT command in LAYOUT
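The point made earlier in this section, that EBCDIC versions can disagree on where special characters are placed, can be illustrated with a short Python sketch. Python does not ship a Teradata EBCDIC codec, so the standard `cp037` and `cp500` EBCDIC code pages are used here purely as stand-ins; the function name and the list of characters are choices made for this example, and the actual Teradata EBCDIC mapping must still be checked against the character set definition.

```python
# Characters that often appear in scripts and differ between EBCDIC variants.
SPECIAL_CHARS = "[]!|^"


def differing_code_points(chars, codec_a="cp037", codec_b="cp500"):
    """Return {char: (hex_in_a, hex_in_b)} for characters encoded differently."""
    diffs = {}
    for ch in chars:
        a = ch.encode(codec_a)
        b = ch.encode(codec_b)
        if a != b:
            diffs[ch] = (a.hex(), b.hex())
    return diffs


# cp037 and cp500 stand in for two EBCDIC versions (not Teradata EBCDIC).
for ch, (a, b) in differing_code_points(SPECIAL_CHARS).items():
    print(f"{ch!r}: cp037=0x{a} cp500=0x{b}")
```

Letters and digits generally sit at the same code points in both code pages, so a check like this mainly flags punctuation and bracket characters.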

UTF-16 Character Sets

Teradata MultiLoad supports the UTF-16 character set on workstation-attached platforms. In general, the command language and the job output use the same character set as the client character set of the job. With the UTF-16 client character set, however, this is not required: the job script and the job output can each be in either UTF-8 or UTF-16. The run-time parameters "-i" and "-u" specify the encoding of the job script and of the job output, respectively, when the job is invoked.
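As a minimal sketch of how a wrapper might assemble these two run-time parameters, the following Python helper builds the argument list for "-i" and "-u". The function name is hypothetical, and the value spellings `UTF8` and `UTF16` are assumptions; check the run-time parameter reference for the exact accepted values. Selecting the client character set itself and launching the utility are outside the scope of this sketch.

```python
# Assumed spellings of the encoding values; verify against the reference.
ALLOWED_ENCODINGS = ("UTF8", "UTF16")


def build_encoding_args(script_encoding: str, output_encoding: str) -> list:
    """Return the -i (script encoding) and -u (output encoding) arguments."""
    for value in (script_encoding, output_encoding):
        if value not in ALLOWED_ENCODINGS:
            raise ValueError(f"unsupported encoding value: {value!r}")
    return ["-i", script_encoding, "-u", output_encoding]


# Example: a UTF-8 job script with UTF-16 job output.
print(build_encoding_args("UTF8", "UTF16"))
```

Validating the values before the job is started avoids launching a load with an encoding that neither the script nor the output can use.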

For more information on the run-time parameters, see the parameters -i scriptencoding and -u outputencoding in Run-time Parameters for Workstation-Attached Configurations.

For more information on using the UTF-16 client character set, see:

  • nullexpr and fieldexpr command parameters for the FIELD command in FIELD
  • WHERE condition for the IMPORT command in IMPORT
  • CONTINUEIF condition for the LAYOUT command in LAYOUT