15.10 - Character Set Support - Access Module

Teradata Tools and Utilities Access Module Reference

prodname: Access Module
vrm_release: 15.10
category: Programming Reference
featnum: B035-2425-035K

Character Set Support

The following session character sets are supported for transferring data:

  • UTF‑8: Preferred character set because it accommodates a superset of characters handled by the other character sets.
  • UTF16: When using the UTF16 session character set, ensure that the data source and the OLE DB provider used to connect to the source or destination database support Unicode characters.
  • ASCII: The Teradata OLE DB Access Module uses the code page of the system locale (also called the system default ANSI code page) when using ASCII. Teradata’s ASCII character set does not match Microsoft’s code page, so character conversions produce minor differences. If an exact match is required, use the UTF‑8 session character set.
    Note: Previous versions of the Teradata OLE DB Access Module used the ANSI‑Latin1 (1252) code page when the ASCII session character set was used, regardless of the system locale.
  • KANJISJIS_0S: Teradata’s Kanji_SJIS character set does not exactly match Microsoft’s Japanese Shift‑JIS code page, so character conversions produce minor differences. If an exact match is required, use the UTF‑8 session character set.
  • LATIN1252_0A: The Teradata OLE DB Access Module converts all Unicode character strings transferred to the Teradata Database to ANSI‑Latin1 (1252) before passing them to a Teradata utility.
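
The differences among these character sets can be illustrated with a short Python sketch. It uses Python's built-in codecs (utf-8, cp1252, shift_jis) as stand-ins. This is an assumption made for illustration only: these codecs are not the conversion tables used by the Teradata OLE DB Access Module or the Teradata Database, and the helper name survives_round_trip is hypothetical.

```python
def survives_round_trip(text: str, codec: str) -> bool:
    """Return True if text can be encoded with codec and decoded back unchanged."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        # At least one character has no representation in this code page.
        return False


# Mixed Western European and Japanese text (illustrative sample data).
mixed = "Résumé 東京"

# A Unicode encoding can carry both scripts.
print("utf-8     :", survives_round_trip(mixed, "utf-8"))

# A single-byte Latin code page cannot represent the Japanese characters.
print("cp1252    :", survives_round_trip(mixed, "cp1252"))

# Each legacy code page still handles text from its own repertoire.
print("cp1252    :", survives_round_trip("Résumé", "cp1252"))
print("shift_jis :", survives_round_trip("東京", "shift_jis"))
```

Data that must pass through unchanged regardless of script is the case in which the list above recommends the UTF‑8 session character set.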

ANSI‑Latin1 (1252) Code Page

When the ASCII character set is specified, the Teradata OLE DB Access Module uses the code page of the system locale. If the code page of the system locale is one in which some characters consume more than one byte, the Teradata OLE DB Access Module can return longer character fields during execution of a load job than if the 1252 code page were used. Teradata OleLoad generates load job scripts that account for this change.

Update any existing scripts accordingly. For example, the system locale “Chinese (Taiwan)” uses code page 950, which has a maximum character size of 2 bytes. If existing FastLoad job scripts use the ASCII session character set to load character data from the Teradata OLE DB Access Module, update them to allow for up to 2 bytes per character, for example by doubling the value of <n> in the CHAR(<n>) and VARCHAR(<n>) fields of the CREATE TABLE and DEFINE statements in those scripts.
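
The doubling rule above can be sketched in Python. The first part uses Python's cp950 codec, as an assumed stand-in for the system code page, to show why the same number of characters can need twice as many bytes. The second part is a hypothetical helper, widen_char_fields, that multiplies <n> in CHAR(<n>) and VARCHAR(<n>). The script fragment it is applied to is invented for illustration and is not a complete FastLoad job.

```python
import re

# Two characters occupy four bytes in code page 950 (assumed stand-in codec).
print(len("台北"), "characters ->", len("台北".encode("cp950")), "bytes")

# Matches CHAR(n) and VARCHAR(n), case-insensitively, capturing the type and n.
FIELD_PATTERN = re.compile(r"\b(VARCHAR|CHAR)\s*\(\s*(\d+)\s*\)", re.IGNORECASE)


def widen_char_fields(script: str, factor: int) -> str:
    """Multiply the length of every CHAR(n) and VARCHAR(n) field by factor."""
    def widen(match: re.Match) -> str:
        type_name, length = match.group(1), int(match.group(2))
        return f"{type_name}({length * factor})"

    return FIELD_PATTERN.sub(widen, script)


# Illustrative fragment only; the field names are hypothetical.
fragment = "DEFINE city (VARCHAR(20)), code (CHAR(4))"

# Code page 950 needs up to 2 bytes per character, so the factor is 2.
print(widen_char_fields(fragment, 2))
```

A text substitution like this is only a starting point: review the resulting lengths against the limits of the target table before running the job.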

Session Character Sets

All Teradata utilities notify the Teradata OLE DB Access Module about the session character set they use.

To set session character sets, specify the character set name as follows:

  • For jobs launched from the Teradata OleLoad GUI, specify the session character set in the Advanced Settings dialog box.
  • For jobs launched without the Teradata OleLoad GUI, specify a character set name as follows:
      • Teradata FastExport, Teradata MultiLoad, or Teradata TPump: Specify the ‑c character‑set‑name parameter on the command line at runtime to set the session character set name.
      • BTEQ or Teradata FastLoad: Specify the .SET SESSION CHARSET command in a script.
      • Teradata PT: Specify the USING CHAR(ACTER) SET charset‑id phrase in the Teradata PT job script.
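
The per-utility rules in the list above can be summarized as a small lookup, sketched here in Python. The function name charset_setting and the lowercase utility keys are hypothetical. The returned strings are only the option or phrase named in the list with the character set name substituted. Quoting, statement terminators, and the rest of each command line or script are not reproduced here and should be taken from the reference for the utility in question.

```python
# Option or phrase named in the list above for each utility (keys are hypothetical).
CHARSET_TEMPLATES = {
    "fastexport": "-c {name}",
    "multiload": "-c {name}",
    "tpump": "-c {name}",
    "bteq": ".SET SESSION CHARSET {name}",
    "fastload": ".SET SESSION CHARSET {name}",
    "tpt": "USING CHARACTER SET {name}",
}


def charset_setting(utility: str, charset: str) -> str:
    """Return the command-line option or script phrase that sets the session character set."""
    try:
        template = CHARSET_TEMPLATES[utility.lower()]
    except KeyError:
        raise ValueError(f"no character set rule known for utility: {utility}") from None
    return template.format(name=charset)


for utility in ("multiload", "bteq", "tpt"):
    print(f"{utility:10s} {charset_setting(utility, 'UTF8')}")
```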