Session Character Sets - OLE DB Provider for Teradata

OLE DB Provider for Teradata User Guide

Product
OLE DB Provider for Teradata
Release Number
15.00
Language
English (United States)
Last Update
2018-09-28
dita:id
B035-2498
Product Category
Teradata Tools and Utilities

Session Character Sets

During initialization, the following rules, applied in order, determine which session character set is used:

  • If a session character set name was written to TDPROP_INIT_SESSIONCHARACTERSET, then that is used.
  • Otherwise, if a session character set name is specified for the Session Character Set attribute in the string written to the extended properties (DBPROP_INIT_PROVIDERSTRING), then that is used. See the Session Character Set attribute in the “List of Attributes” on page 44.
  • Otherwise, if the TDOLEDB_DEFAULT_SESSION_CHARACTER_SET environment variable value (in the process in which this instance of the provider is running) is the name of one of the four supported session character sets, then that is used.
  • Otherwise, if DBPROP_INIT_LCID contains the locale ID for a Japanese locale, then KANJISJIS_0S is used.
  • Otherwise, ASCII is used.
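
The precedence rules above can be sketched as a small function. This is only an illustration of the documented order, not the provider's implementation: the function name, its parameters, and the `environ` argument are hypothetical, the environment variable value is assumed to be matched exactly (case-sensitively), and a "Japanese locale" is assumed to mean an LCID whose Windows primary language identifier is 0x11 (LANG_JAPANESE).

```python
import os

# The four session character sets named in this topic.
SUPPORTED = ("ASCII", "UTF8", "UTF16", "KANJISJIS_0S")

# Windows primary language identifier for Japanese (low 10 bits of the LCID).
LANG_JAPANESE = 0x11


def resolve_session_charset(init_charset=None, provider_string_charset=None,
                            lcid=None, environ=None):
    """Hypothetical helper mirroring the documented precedence order.

    init_charset            -- value written to TDPROP_INIT_SESSIONCHARACTERSET
    provider_string_charset -- Session Character Set attribute taken from
                               DBPROP_INIT_PROVIDERSTRING
    lcid                    -- value of DBPROP_INIT_LCID
    environ                 -- mapping of environment variables (defaults to
                               the current process environment)
    """
    env = os.environ if environ is None else environ

    # Rule 1: the provider-specific initialization property wins.
    if init_charset:
        return init_charset

    # Rule 2: the attribute in the extended properties string.
    if provider_string_charset:
        return provider_string_charset

    # Rule 3: the environment variable, only if it names a supported set.
    # Exact, case-sensitive matching is an assumption of this sketch.
    env_value = env.get("TDOLEDB_DEFAULT_SESSION_CHARACTER_SET")
    if env_value in SUPPORTED:
        return env_value

    # Rule 4: a Japanese locale ID selects KANJISJIS_0S.
    if lcid is not None and (lcid & 0x3FF) == LANG_JAPANESE:
        return "KANJISJIS_0S"

    # Rule 5: fall back to ASCII.
    return "ASCII"
```

For example, with neither property set, an environment value of `UTF8` would be chosen even if the locale ID is 0x0411 (Japanese, Japan), because the environment variable is consulted before the locale.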

When choosing a session character set, consider the following:

  • The UTF16 and UTF8 session character sets are encodings of the Unicode character set. The Unicode character set includes the characters needed for virtually all languages in modern use.
    The UTF16 and UTF8 session character sets support an identical set of characters. All Unicode characters that do not require surrogate code points are supported. Each supported character consumes two bytes in the UTF16 session character set and one to three bytes in the UTF8 session character set. The set of characters supported by the UTF16 and UTF8 session character sets is a superset of the characters supported by the ASCII and KANJISJIS_0S session character sets.
  • ASCII, UTF16, and UTF8 are permanently enabled session character sets supported by Teradata Database. KANJISJIS_0S must be made active on Teradata Database in order to be available for use.
  • When the VARCHAR data being accessed consists chiefly of ASCII characters, the UTF8 session character set may perform better than UTF16 because many of the characters consume only a single byte. However, when the UTF8 session character set is used, fixed-length character values (that is, CHAR(n)) exchanged between the database and OLE DB Provider for Teradata always reserve three bytes per character, so when the data being accessed is chiefly CHAR(n), the UTF16 session character set may perform better.
  • When the ASCII session character set is used, OLE DB Provider for Teradata converts character data using the application code page.
  • When the KANJISJIS_0S session character set is used, OLE DB Provider for Teradata converts character data using the Microsoft Windows code page 932 (Japanese Shift-JIS).

For more information on session character sets, see International Character Set Support.
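
The UTF8 versus UTF16 trade-off described above comes down to simple byte arithmetic. The sketch below estimates character payload sizes only, using the per-character figures stated in this topic; the helper names are hypothetical, and length prefixes, padding rules, and any other protocol overhead are deliberately ignored, so the numbers are not actual wire sizes.

```python
# Python codecs used to measure encoded length for each session character set.
# utf-16-le is chosen only so that no byte order mark is counted.
CODECS = {"UTF8": "utf-8", "UTF16": "utf-16-le"}

# Bytes reserved per character for fixed-length CHAR(n) values,
# as stated in this topic.
FIXED_BYTES_PER_CHAR = {"UTF8": 3, "UTF16": 2}


def varchar_payload_bytes(text, charset):
    """Estimate the bytes needed for a VARCHAR value (actual encoded length)."""
    return len(text.encode(CODECS[charset]))


def char_payload_bytes(n, charset):
    """Estimate the bytes reserved for a CHAR(n) value (fixed per character)."""
    return n * FIXED_BYTES_PER_CHAR[charset]


# Mostly-ASCII VARCHAR data favors UTF8.
print(varchar_payload_bytes("hello", "UTF8"))   # 5
print(varchar_payload_bytes("hello", "UTF16"))  # 10

# CHAR(10) favors UTF16, regardless of the content.
print(char_payload_bytes(10, "UTF8"))           # 30
print(char_payload_bytes(10, "UTF16"))          # 20
```

Under these assumptions, a table dominated by CHAR(n) columns transfers about one and a half times as many character bytes with UTF8 as with UTF16, while mostly-ASCII VARCHAR data transfers about half as many.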