Unicode Character Sets - FastExport

Teradata® FastExport Reference

Product: FastExport
Release Number: 16.20
Published: September 2020
Language: English (United States)
Last Update: 2020-09-11
dita:id: B035-2410
lifecycle: previous
Product Category: Teradata Tools and Utilities

UTF-8 and UTF-16 are two standard encodings for Unicode character data. The UTF-8 client character set supports UTF-8 encoding; currently, Teradata Database supports UTF-8 characters that are one to three bytes long. The UTF-16 client character set supports UTF-16 encoding; currently, Teradata Database supports the Unicode 2.1 standard, in which each defined character requires exactly 16 bits.

Teradata Database imposes restrictions on the use of the UTF-8 and UTF-16 character sets. For details, see International Character Set Support (B035-1125).
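To make the byte counts above concrete, the following minimal sketch uses Python's built-in codecs (not FastExport or Teradata Database) to measure the encoded size of three sample characters. The sample characters and the helper name `encoded_sizes` are chosen here for illustration only.

```python
# Sample characters from the Basic Multilingual Plane:
# U+0041 (A), U+00E9 (e with acute accent), U+20AC (euro sign).
samples = ["A", "\u00e9", "\u20ac"]

def encoded_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-be" avoids the 2-byte byte-order mark that plain "utf-16" prepends.
    utf16_len = len(ch.encode("utf-16-be"))
    return utf8_len, utf16_len

for ch in samples:
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} byte(s), UTF-16 = {utf16_len} byte(s)")
```

The three samples occupy one, two, and three bytes in UTF-8 respectively, while each occupies exactly 16 bits in UTF-16.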

UTF-8 Character Sets

FastExport supports the UTF-8 character set on network-attached platforms and IBM z/OS. When the UTF-8 client character set is used on IBM z/OS, the job script must be in Teradata EBCDIC. FastExport translates commands in the job script from Teradata EBCDIC to UTF-8 during the export.

Be sure to check the definition in International Character Set Support (B035-1125) to determine the code points of any special characters required in the job script.

Different versions of EBCDIC do not always agree as to the placement of these characters. See International Character Set Support (B035-1125) for details on mapping Teradata EBCDIC and Unicode.
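The disagreement between EBCDIC versions can be illustrated with a short sketch. It compares two EBCDIC code pages that ship with Python, cp037 and cp500, purely as stand-ins: neither is Teradata EBCDIC, whose actual mapping is defined in International Character Set Support (B035-1125). The helper name `ebcdic_code_points` and the choice of characters are assumptions made for this example.

```python
# cp037 and cp500 are generic EBCDIC code pages bundled with Python.
# They are used only as stand-ins; neither one is Teradata EBCDIC.
SPECIAL = "[]!|^"

def ebcdic_code_points(text, codec):
    """Map each character to its single-byte code point in the given codec."""
    return {ch: ch.encode(codec)[0] for ch in text}

cp037 = ebcdic_code_points(SPECIAL, "cp037")
cp500 = ebcdic_code_points(SPECIAL, "cp500")

for ch in SPECIAL:
    status = "differs" if cp037[ch] != cp500[ch] else "same"
    print(f"{ch!r}: cp037=0x{cp037[ch]:02X} cp500=0x{cp500[ch]:02X} ({status})")
```

Letters and digits usually sit at the same code points across EBCDIC variants, which is why mismatches tend to surface only in the special characters of a job script.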

UTF-16 Character Sets

FastExport supports the UTF-16 character set on network-attached platforms. In general, the command language and the job output should use the same character set as the client character set of the job. However, for convenience, and because of the special property of Unicode, this is not required when the UTF-16 character set is used: the job script and the job output can each be in either UTF-8 or UTF-16. To select these encodings, specify the runtime parameters “-i” and “-u” when invoking the job.

For more information on runtime parameters “-i” and “-u”, see Runtime Parameters for Network-Attached Systems.
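As a rough sketch of how these parameters combine, the hypothetical helper below assembles an argument list for an invocation. Only the flags -c, -i, and -u come from this section; the executable name `fexp`, the value spellings `UTF8` and `UTF16`, and the rule that -i and -u are accepted only with the UTF-16 client character set are assumptions made for illustration. Check Runtime Parameters for Network-Attached Systems for the actual syntax.

```python
# Assumed spellings of the encoding values; not confirmed by this section.
UNICODE_ENCODINGS = {"UTF8", "UTF16"}

def build_fexp_args(client_charset, script_encoding=None, output_encoding=None):
    """Build an argument list for a hypothetical FastExport invocation."""
    args = ["fexp", "-c", client_charset]  # "fexp" is an assumed executable name
    for flag, value in (("-i", script_encoding), ("-u", output_encoding)):
        if value is None:
            continue  # parameter not supplied
        if client_charset != "UTF16":
            # Modeling assumption: -i and -u apply only to UTF-16 jobs.
            raise ValueError(f"{flag} is only modeled for the UTF-16 client character set")
        if value not in UNICODE_ENCODINGS:
            raise ValueError(f"{flag} must be one of {sorted(UNICODE_ENCODINGS)}")
        args += [flag, value]
    return args

# A UTF-16 session whose job script is in UTF-8 and whose output is in UTF-16.
print(build_fexp_args("UTF16", script_encoding="UTF8", output_encoding="UTF16"))
```

Building the list separately from running the command keeps the validation testable without a FastExport installation.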

The following table describes four ways to either specify the character set or accept a default specification.

Methods for Specifying Character Sets

Method: Client System Specification
Description: The character set can also be specified for a client system before invoking FastExport by configuring the:
  • HSHSPB parameter for mainframe-attached z/OS client systems
  • clispb.dat file for network-attached UNIX and Windows client systems
The character-set-name specification used when invoking FastExport always takes precedence over the current client system specification.

Method: FastExport Utility Default
Description: If there is no character set specification in DBC.Hosts, then FastExport defaults to:
  • EBCDIC for mainframe-attached z/OS client systems
  • ASCII for network-attached UNIX client systems

Method: Runtime Parameter Specification
Description: The best way to specify the character set is with the character set runtime parameter when invoking FastExport (see Invoking FastExport):
  • CHARSET=character-set-name for mainframe-attached z/OS client systems
  • -c character-set-name for network-attached UNIX and Windows client systems
For a list of valid character set names, see Character Set Specification.

Method: Teradata Database Default
Description: If a character-set-name specification is not used when FastExport is invoked, and there is no character set specification for the client system, then the utility uses the default specification in the Teradata Database system table DBC.Hosts.
When relying on the DBC.Hosts specification for the default character set, make sure that the initial logon is in the default character set:
  • EBCDIC for mainframe-attached z/OS client systems
  • ASCII for network-attached UNIX and Windows client systems
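
The order in which the four methods in the table above apply can be summarized with a small sketch. This is an illustrative model of the documented precedence, not FastExport code; the function name `resolve_charset`, its parameters, and the attachment labels "mainframe" and "network" are hypothetical.

```python
# Utility defaults from the table: EBCDIC for mainframe-attached clients,
# ASCII for network-attached clients.
UTILITY_DEFAULTS = {"mainframe": "EBCDIC", "network": "ASCII"}

def resolve_charset(attachment, runtime_param=None, client_system=None, dbc_hosts=None):
    """Return (character_set, source) following the documented precedence."""
    candidates = (
        ("runtime parameter", runtime_param),            # CHARSET= or -c
        ("client system specification", client_system),  # HSHSPB or clispb.dat
        ("DBC.Hosts default", dbc_hosts),                # Teradata Database default
    )
    for source, value in candidates:
        if value:
            return value, source
    # Nothing was specified anywhere, so fall back to the utility default.
    return UTILITY_DEFAULTS[attachment], "FastExport utility default"

print(resolve_charset("network", runtime_param="UTF8", client_system="ASCII"))
print(resolve_charset("mainframe"))
```

The first call shows the runtime parameter overriding the client system specification; the second shows the fallback when no specification exists at any level.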