Unicode Character Sets

Teradata® FastExport Reference

Product

FastExport

Release Number

16.20

Published

September 2020

Language

English (United States)

Last Update

2020-09-11

dita:id

B035-2410

Product Category

Teradata Tools and Utilities

UTF-8 and UTF-16 are two of the standard ways of encoding Unicode character data. The UTF-8 client character set supports UTF-8 encoding; currently, Teradata Database supports UTF-8 characters that are one to three bytes long. The UTF-16 client character set supports UTF-16 encoding; currently, Teradata Database supports the Unicode 2.1 standard, in which each defined character requires exactly 16 bits.
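As a rough illustration of these size limits, the following Python sketch measures how many UTF-8 bytes and 16-bit UTF-16 units a few sample characters need. It uses only Python's built-in codecs and is not a Teradata validation routine; the helper names (`utf8_len`, `utf16_units`, `fits_described_limits`) are hypothetical and exist only for this example.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units the character occupies in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def fits_described_limits(ch: str) -> bool:
    """True if the character needs at most three UTF-8 bytes and exactly
    one 16-bit UTF-16 unit, the limits described in the paragraph above."""
    return utf8_len(ch) <= 3 and utf16_units(ch) == 1


# Sample characters: ASCII, Latin-1, a three-byte symbol, and an emoji
# that lies outside the 16-bit range.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(
        f"U+{ord(ch):04X}: UTF-8 bytes={utf8_len(ch)}, "
        f"UTF-16 units={utf16_units(ch)}, "
        f"within limits={fits_described_limits(ch)}"
    )
```

Characters that need four UTF-8 bytes are the same ones that need two 16-bit units in UTF-16, so both checks fail together for the last sample.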

Teradata Database imposes restrictions on the use of the UTF-8 and UTF-16 character sets. For details, see International Character Set Support (B035-1125).

UTF-8 Character Sets

FastExport supports the UTF-8 character set on network-attached platforms and IBM z/OS. When the UTF-8 client character set is used on IBM z/OS, the job script must be in Teradata EBCDIC. FastExport translates the commands in the job script from Teradata EBCDIC to UTF-8 during the export.

Check the Teradata EBCDIC definition in International Character Set Support (B035-1125) to determine the code points of any special characters required in the job script, because different versions of EBCDIC do not always agree on the placement of these characters. The same book describes the mapping between Teradata EBCDIC and Unicode.
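To show the kind of disagreement meant here, the sketch below compares two EBCDIC variants that ship with Python, `cp037` and `cp500`. These are stand-ins chosen only because they are available in the standard library; Teradata EBCDIC is a separate definition that Python does not provide, so its actual code points must still be taken from International Character Set Support (B035-1125). The function name `differing_code_points` is hypothetical.

```python
def differing_code_points(chars: str, codec_a: str, codec_b: str) -> dict:
    """Return {char: (byte_in_a, byte_in_b)} for every character whose
    single-byte code point differs between the two codecs."""
    diffs = {}
    for ch in chars:
        a = ch.encode(codec_a)[0]
        b = ch.encode(codec_b)[0]
        if a != b:
            diffs[ch] = (a, b)
    return diffs


# Letters and digits tend to agree across EBCDIC variants; punctuation
# such as brackets often does not.
candidates = "A9[]!^|"
for ch, (a, b) in differing_code_points(candidates, "cp037", "cp500").items():
    print(f"{ch!r}: cp037=0x{a:02X}  cp500=0x{b:02X}")
```

A job script that contains any character reported by a check like this would be read differently depending on which EBCDIC variant the reader assumes.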

UTF-16 Character Sets

FastExport supports the UTF-16 character set on network-attached platforms. In general, the command language and the job output should use the same character set as the client character set used by the job. However, for convenience and because of the special properties of Unicode, this is not required when the UTF-16 client character set is used: the job script and the job output can each be in either UTF-8 or UTF-16. To select their encodings, specify the runtime parameters “-i” and “-u” when invoking the job.

For more information on runtime parameters “-i” and “-u”, see Runtime Parameters for Network-Attached Systems.
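The reason the encoding of the job script has to be stated explicitly can be seen with a small Python sketch: the same text produces different byte sequences in UTF-8 and UTF-16, and bytes written in one encoding generally cannot be read as the other. The script line used here is only an illustrative placeholder, and the helper `can_decode` is hypothetical; the sketch does not call FastExport or model how “-i” and “-u” are parsed.

```python
# Illustrative one-line job script (placeholder content).
script_text = ".LOGOFF;\n"

utf8_bytes = script_text.encode("utf-8")
utf16_bytes = script_text.encode("utf-16")  # includes a 2-byte byte order mark


def can_decode(data: bytes, encoding: str) -> bool:
    """True if the bytes form a valid sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print("UTF-8 size :", len(utf8_bytes))
print("UTF-16 size:", len(utf16_bytes))
print("UTF-16 bytes readable as UTF-8:", can_decode(utf16_bytes, "utf-8"))
```

Because nothing in the session character set alone reveals which of the two forms a script file uses, a reader of the file needs that information supplied separately.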

The following table describes four ways to specify the character set or to accept a default specification, listed in order of precedence.

Methods for Specifying Character Sets
Method	Description
Runtime Parameter Specification	The best way to specify the character set is with the character set runtime parameter when invoking FastExport (see Invoking FastExport): CHARSET=character-set-name for mainframe-attached z/OS client systems; -c character-set-name for network-attached UNIX and Windows client systems. For a list of valid character set names, see Character Set Specification.
Client System Specification	Another way is to specify the character set for the client system before invoking FastExport by configuring the HSHSPB parameter (mainframe-attached z/OS client systems) or the clispb.dat file (network-attached UNIX and Windows client systems). The character-set-name specification used when invoking FastExport always takes precedence over the current client system specification.
Teradata Database Default	If a character-set-name specification is not used when FastExport is invoked, and there is no character set specification for the client system, then the utility uses the default specification in the Teradata Database system table DBC.Hosts. If the DBC.Hosts table specification is relied upon for the default character set, make sure that the initial logon is in the default character set: EBCDIC for mainframe-attached z/OS client systems; ASCII for network-attached UNIX and Windows client systems.
FastExport Utility Default	If there is no character set specification in DBC.Hosts, then FastExport defaults to EBCDIC for mainframe-attached z/OS client systems and ASCII for network-attached UNIX client systems.
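The fallback order that the table describes can be sketched as a small Python function. This is only a model of the precedence rules stated above, not FastExport's implementation: the function name `resolve_charset`, its parameters, and the `"mainframe"`/`"network"` labels are hypothetical, and the character set names passed in the usage lines are illustrative values.

```python
from typing import Optional, Tuple

# Utility defaults named in the table, keyed by how the client is attached.
UTILITY_DEFAULTS = {"mainframe": "EBCDIC", "network": "ASCII"}


def resolve_charset(
    attachment: str,
    runtime_param: Optional[str] = None,
    client_system: Optional[str] = None,
    dbc_hosts: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (character set, source) using the precedence in the table:
    runtime parameter, then client system specification, then DBC.Hosts,
    then the FastExport utility default for the attachment type."""
    candidates = (
        ("runtime parameter", runtime_param),
        ("client system specification", client_system),
        ("DBC.Hosts default", dbc_hosts),
    )
    for source, value in candidates:
        if value:
            return value, source
    return UTILITY_DEFAULTS[attachment], "FastExport utility default"


# A runtime parameter wins over a client system setting.
print(resolve_charset("network", runtime_param="UTF8", client_system="ASCII"))
# With nothing specified anywhere, the utility default applies.
print(resolve_charset("mainframe"))
```

Returning the source alongside the value makes it easy to see which level of the table supplied the character set for a given job.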