16.10 - Session Character Sets - Access Module

Teradata Tools and Utilities Access Module Reference

prodname: Access Module
vrm_release: 16.10
created_date: July 2017
category: Programming Reference
featnum: B035-2425-077K

All Teradata utilities (except for BTEQ) notify the Teradata Access Module for JMS about the session character set they use. Specify the session character set name for each utility as follows:

  • Teradata FastExport, Teradata MultiLoad, or Teradata TPump – Specify the -c character-set-name parameter on the command line at runtime to set the session character set name.
  • BTEQ or Teradata FastLoad – Specify the .SET SESSION CHARSET command in the script.
  • Teradata PT – Specify the USING CHARACTER SET <charset-id> option preceding the DEFINE JOB statement in the script.

The Teradata Access Module for JMS uses a predefined mapping of Teradata session character sets to Java character sets. This mapping is stored in a default properties file, called charsets.properties, in the installation folder of the access module. For a given Teradata session character set, the corresponding Java character set is used to encode bytes sent to the message queue system and to decode bytes received from it. You can add new character sets to the properties file and edit existing mappings.
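
The lookup described above can be sketched in Java as follows. This is a minimal illustration, not the access module's actual code: the key=value layout of charsets.properties is an assumption, and the class and method names (SessionCharsetLookup, loadMapping, javaCharsetFor) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Properties;

class SessionCharsetLookup {

    // Assumed file layout: <Teradata session character set>=<Java character set>.
    // The real layout of charsets.properties may differ.
    static final String SAMPLE_PROPERTIES =
            "ASCII=ASCII\n"
            + "UTF-8=UTF-8\n"
            + "LATIN1_0A=ISO8859_1\n";

    // Parse the mapping text into a Properties object.
    static Properties loadMapping(String text) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    // Resolve a Teradata session character set name to a Java Charset.
    static Charset javaCharsetFor(Properties mapping, String sessionCharset) {
        String javaName = mapping.getProperty(sessionCharset);
        if (javaName == null) {
            throw new IllegalArgumentException("No mapping for " + sessionCharset);
        }
        return Charset.forName(javaName);
    }

    public static void main(String[] args) {
        Properties mapping = loadMapping(SAMPLE_PROPERTIES);
        Charset charset = javaCharsetFor(mapping, "LATIN1_0A");

        // Encode text on its way to the queue, decode bytes on the way back.
        byte[] outbound = "caf\u00e9".getBytes(charset);
        String inbound = new String(outbound, charset);

        System.out.println("Java charset: " + charset.name());
        System.out.println("Encoded length: " + outbound.length);
        System.out.println("Round trip intact: " + inbound.equals("caf\u00e9"));
    }
}
```

Using the same Charset object for both directions keeps the encode and decode steps symmetric, which matches the behavior described for a given session character set.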

Character Set Mapping

Teradata Session Character Set    Java Character Set
ASCII                             ASCII
UTF-8                             UTF-8
UTF-16                            UnicodeBigUnmarked (for big-endian platforms)
UTF-16                            UnicodeLittleUnmarked (for little-endian platforms)
EBCDIC037_0E                      Cp037
EBCDIC273_0E                      Cp273
EBCDIC277_0E                      Cp277
HANGULEBCDIC933_1II               Cp933
HANGULKSC5601_2R4                 MS949
KANJIEBCDIC5026_0I                Cp930
KANJIEBCDIC5035_0I                Cp939
KANJIEUC_0U                       EUC_JP
KANJISJIS_0S                      MS932
KATAKANAEBCDIC                    CP930
LATIN1_0A                         ISO8859_1
LATIN1252_0A                      Cp1252
LATIN9_0A                         ISO8859_15_FDIS
SCHEBCDIC935_2IJ                  Cp935
SCHGB2312_1T0                     EUC_CN
TCHBIG5_1R0                       BIG5
TCHEBCDIC937_3IB                  Cp937
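
The two UTF-16 rows differ only in byte order. The sketch below shows what that difference means for the bytes produced. It assumes the platform byte order can be read with ByteOrder.nativeOrder(); how the access module itself detects endianness is not stated here, and the class and method names are hypothetical.

```java
import java.nio.ByteOrder;
import java.nio.charset.Charset;

class Utf16Endianness {

    // Java character set names taken from the UTF-16 rows of the mapping table.
    static String javaNameFor(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN
                ? "UnicodeBigUnmarked"
                : "UnicodeLittleUnmarked";
    }

    // Encode text as UTF-16 in the given byte order, without a byte order mark.
    static byte[] encode(String text, ByteOrder order) {
        return text.getBytes(Charset.forName(javaNameFor(order)));
    }

    // Render bytes as space-separated hexadecimal pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        ByteOrder nativeOrder = ByteOrder.nativeOrder();
        System.out.println("native: " + nativeOrder + " -> " + javaNameFor(nativeOrder));
        System.out.println("big:    " + hex(encode("A", ByteOrder.BIG_ENDIAN)));    // 00 41
        System.out.println("little: " + hex(encode("A", ByteOrder.LITTLE_ENDIAN))); // 41 00
    }
}
```

Both "Unmarked" names encode without a leading byte order mark, so a consumer of the queue has to know the byte order in advance.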

If ASCII is specified as the session character set, the Teradata Access Module for JMS uses the platform default encoding. If that default encoding cannot represent certain characters, character conversions might produce minor differences in the data. If an exact match is required, use the UTF-8 session character set to avoid conversions between character sets.
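
The kind of difference a conversion can introduce is easy to reproduce in plain Java. The following sketch is independent of the access module: it encodes a string and decodes it again with the same character set, then reports whether the text survived. US-ASCII stands in here for a default encoding that cannot represent every character, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripCheck {

    // Returns true if the text is unchanged after encoding and decoding.
    // String.getBytes substitutes a replacement byte for unmappable characters,
    // so a lossy character set makes this return false.
    static boolean survivesRoundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return text.equals(new String(encoded, charset));
    }

    public static void main(String[] args) {
        // "naive" with a diaeresis, followed by a euro sign and a digit.
        String sample = "na\u00efve \u20ac5";

        Charset platformDefault = Charset.defaultCharset();
        System.out.println("Platform default (" + platformDefault.name() + "): "
                + survivesRoundTrip(sample, platformDefault));
        System.out.println("US-ASCII: "
                + survivesRoundTrip(sample, StandardCharsets.US_ASCII));
        System.out.println("UTF-8: "
                + survivesRoundTrip(sample, StandardCharsets.UTF_8));
    }
}
```

The result for the platform default depends on the machine that runs the check, which is the reason the UTF-8 session character set is the safer choice when an exact match is required.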