Unicode Character Sets - FastExport

Teradata® FastExport Reference

Product
FastExport
Release Number
17.00
Published
September 30, 2020
Language
English (United States)
Last Update
2020-09-02
dita:id
B035-2410
Product Category
Teradata Tools and Utilities

UTF-8 and UTF-16 are two of the standard encodings for Unicode character data. The UTF-8 client character set supports UTF-8 encoding. Currently, the database supports UTF-8 characters of one to three bytes. The UTF-16 client character set supports UTF-16 encoding. Currently, Vantage supports the Unicode 2.1 standard, where each defined character requires exactly 16 bits.

Vantage imposes restrictions on the use of the UTF-8 and UTF-16 character sets. For details, see Teradata Vantage™ - Advanced SQL Engine International Character Set Support, B035-1125.
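The size limits above can be checked for individual characters. The following Python sketch is illustrative only: the helper names `utf8_length` and `fits_in_16_bits` are hypothetical, and the code tests encoded size only, not whether Vantage actually defines a given code point.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes the single character `ch` occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_16_bits(ch: str) -> bool:
    """Return True if `ch` is one 16-bit UTF-16 code unit (no surrogate pair)."""
    return ord(ch) <= 0xFFFF


# One-, two-, three-, and four-byte UTF-8 characters.
for ch in ["A", "é", "€", "😀"]:
    print(
        f"U+{ord(ch):04X}: {utf8_length(ch)} UTF-8 byte(s), "
        f"single 16-bit unit: {fits_in_16_bits(ch)}"
    )
```

Characters up to U+FFFF need at most three UTF-8 bytes and a single 16-bit unit in UTF-16. A character such as U+1F600 needs four UTF-8 bytes and a surrogate pair, so it falls outside both limits described above.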

UTF-8 Character Sets

FastExport supports the UTF-8 character set on workstation-attached platforms and on IBM z/OS. When using the UTF-8 client character set on IBM z/OS, the job script must be in Teradata EBCDIC. FastExport translates commands in the job script from Teradata EBCDIC to UTF-8 during the export.

Check the definition in Teradata Vantage™ - Advanced SQL Engine International Character Set Support, B035-1125 to determine the code points of any special characters required in the job script.

Different versions of EBCDIC do not always agree as to the placement of these characters. See Teradata Vantage™ - Advanced SQL Engine International Character Set Support, B035-1125 for details on mapping Teradata EBCDIC and Unicode.
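To illustrate how EBCDIC variants can disagree, the following Python sketch compares two EBCDIC code pages that ship with Python (`cp037` and `cp500`). These codecs are stand-ins chosen for illustration and are not Teradata EBCDIC. The helper name `differing_code_points` is hypothetical. For the actual Teradata mapping, use the reference cited above.

```python
# Special characters that often appear in job scripts.
SPECIAL = "[]!|^"


def differing_code_points(chars, codec_a="cp037", codec_b="cp500"):
    """Map each character whose single-byte code point differs between
    the two codecs to a (codec_a byte, codec_b byte) tuple."""
    diffs = {}
    for ch in chars:
        a = ch.encode(codec_a)[0]
        b = ch.encode(codec_b)[0]
        if a != b:
            diffs[ch] = (a, b)
    return diffs


for ch, (a, b) in differing_code_points(SPECIAL).items():
    print(f"{ch!r}: cp037=0x{a:02X} cp500=0x{b:02X}")
```

Letters and digits generally sit at the same code points across EBCDIC variants. Punctuation such as square brackets is where the placement tends to differ.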

UTF-16 Character Sets

FastExport supports the UTF-16 character set on workstation-attached platforms. In general, the command language and the job output use the same character set as the client character set of the job. However, for convenience, and because of the special properties of Unicode, this is not required with the UTF-16 character set: the job script and the job output can each be in either UTF-8 or UTF-16. To specify this, use the runtime parameters “-i” and “-u” when invoking the job.

For more information on runtime parameters “-i” and “-u”, see Runtime Parameters for Workstation-Attached Systems.
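As a rough sketch of how such an invocation might be assembled, the following Python helper builds an argument list. Several details are assumptions that the text above does not confirm: the executable name `fexp`, the value spellings `UTF16` and `UTF8`, the mapping of “-i” to the job script and “-u” to the job output, and the rule that both parameters are rejected for other client character sets. The function name `build_fexp_args` is hypothetical. Confirm the details against Runtime Parameters for Workstation-Attached Systems before relying on them.

```python
def build_fexp_args(client_charset, script_encoding=None, output_encoding=None):
    """Build an argument list for a FastExport invocation (illustrative only).

    Assumptions: executable name "fexp"; -i carries the script encoding and
    -u the output encoding; both only make sense with a UTF-16 client
    character set.
    """
    args = ["fexp", "-c", client_charset]
    uses_utf16 = client_charset.upper().replace("-", "") == "UTF16"
    for flag, value in (("-i", script_encoding), ("-u", output_encoding)):
        if value is None:
            continue
        if not uses_utf16:
            raise ValueError(f"{flag} is only used with a UTF-16 client character set")
        args += [flag, value]
    return args


# UTF-16 session with a UTF-8 job script and UTF-8 job output.
print(build_fexp_args("UTF16", script_encoding="UTF8", output_encoding="UTF8"))
```

The list can be passed to a process launcher such as `subprocess.run`, with the job script supplied in whatever way the installation expects.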

The following table describes four ways to either specify the character set or accept a default specification.

Methods for Specifying Character Sets

Method: Runtime Parameter Specification
Description: The best way to specify the character set is with the character set runtime parameter when invoking FastExport, as described earlier:
  • CHARSET=character-set-name for mainframe-attached z/OS client systems
  • -c character-set-name for workstation-attached UNIX and Windows client systems
For a list of valid character set names, see Character Set Specification.

Method: Client System Specification
Description: Another way is to specify the character set for a client system before invoking FastExport by configuring the:
  • HSHSPB parameter for mainframe-attached z/OS client systems
  • clispb.dat file for workstation-attached UNIX and Windows client systems
The character-set-name specification used when invoking FastExport always takes precedence over the current client system specification.

Method: Database Default
Description: If a character-set-name specification is not used when FastExport is invoked, and there is no character set specification for the client system, then the utility uses the default specification in the database system table DBC.Hosts.
If you rely on the DBC.Hosts table specification for the default character set, make sure that the initial logon is in the default character set:
  • EBCDIC for mainframe-attached z/OS client systems
  • ASCII for workstation-attached UNIX and Windows client systems

Method: FastExport Utility Default
Description: If there is no character set specification in DBC.Hosts, then FastExport defaults to:
  • EBCDIC for mainframe-attached z/OS client systems
  • ASCII for workstation-attached UNIX and Windows client systems
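The four methods in the table imply a fallback order: runtime parameter, then client system specification, then the DBC.Hosts default, then the utility default. The following Python sketch models that order. It is not how FastExport is implemented. The function name `resolve_charset`, its parameters, and the platform keys `"mainframe"` and `"workstation"` are hypothetical.

```python
# Utility defaults from the table, keyed by a hypothetical platform label.
UTILITY_DEFAULTS = {"mainframe": "EBCDIC", "workstation": "ASCII"}


def resolve_charset(platform, runtime_param=None, client_system_spec=None,
                    dbc_hosts_default=None):
    """Return (character set name, source) using the fallback order in the table."""
    candidates = (
        ("runtime parameter", runtime_param),
        ("client system specification", client_system_spec),
        ("database default (DBC.Hosts)", dbc_hosts_default),
    )
    for source, value in candidates:
        if value:
            return value, source
    # Nothing was specified anywhere: fall back to the utility default.
    return UTILITY_DEFAULTS[platform], "FastExport utility default"


# The runtime parameter wins over the client system specification.
print(resolve_charset("workstation", runtime_param="UTF8", client_system_spec="ASCII"))
# With nothing specified, a mainframe-attached client falls back to EBCDIC.
print(resolve_charset("mainframe"))
```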