Mapping File for a Single-Byte Character Set - Teradata Database

International Character Set Support

Product: Teradata Database
Release Number: 15.10
Language: English (United States)
Last Update: 2018-09-25
dita:id: B035-1125
lifecycle: previous
Product Category: Teradata® Database

The mapping file that the user creates for a new single-byte client character set should contain up to four translation tables, analogous to the translation tables in DBC.CharTranslationV.

  • The first table defines the translation from the single-byte transitional form to the double-byte Unicode form.
  • The second table defines the translation from the single-byte transitional form to the corresponding uppercase double-byte Unicode form. Only characters that are mapped differently from the first table should be included.
  • The third table is optional. It lets the user define a many-to-one mapping from Unicode to the external character set. If the third table is not present, an inverse mapping is computed from the translation defined in the first table.
  • The fourth table defines the translation from the double-byte Unicode form to the corresponding uppercase single-byte internal transitional form. Only characters that are mapped differently from the optional third table (or from the inverse mapping computed from the first table, if the third table is not present) should be included.
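
The relationship between the four tables can be sketched in Python. This is an illustration only, under stated assumptions: the dictionaries, the sample code points, and the helper names `invert` and `effective_tables` are hypothetical, and the rule the database uses when several byte values map to the same Unicode code point is not stated here, so the sketch keeps the lowest byte value.

```python
def invert(first_table):
    """Compute an inverse (Unicode -> single-byte) mapping from the first table.

    Assumption: when several byte values map to the same Unicode code point,
    the lowest byte value wins. The real tie-breaking rule is not given here.
    """
    inverse = {}
    for byte_value in sorted(first_table):
        inverse.setdefault(first_table[byte_value], byte_value)
    return inverse


def effective_tables(first, second=None, third=None, fourth=None):
    """Expand the sparse tables into four complete mappings."""
    to_unicode = dict(first)
    # Table 2 lists only the entries that differ from table 1.
    to_unicode_upper = {**first, **(second or {})}
    # Table 3 is optional; fall back to the inverse of table 1.
    from_unicode = dict(third) if third is not None else invert(first)
    # Table 4 lists only the entries that differ from table 3 (or the inverse).
    from_unicode_upper = {**from_unicode, **(fourth or {})}
    return to_unicode, to_unicode_upper, from_unicode, from_unicode_upper


# Hypothetical three-character set: 0x41 'A', 0x61 'a', 0x2D '-'.
first = {0x41: 0x0041, 0x61: 0x0061, 0x2D: 0x002D}
second = {0x61: 0x0041}   # byte 0x61 uppercases to U+0041
fourth = {0x0061: 0x41}   # U+0061 uppercases to byte 0x41

to_u, to_u_upper, from_u, from_u_upper = effective_tables(
    first, second, fourth=fourth
)
```

Because no third table is supplied in this example, `from_u` is the inverse of `first`, and `from_u_upper` differs from it only in the single entry listed in `fourth`.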
The mapping file for a single-byte client character set can optionally contain the statement:

    #STATEMACHINE SBC

    Note: If a mapping file does not specify the STATEMACHINE statement, then the default is STATEMACHINE SBC.

    The STATEMACHINE statement describes the encoding form of the character set. STATEMACHINE SBC means that the character set is a single-byte character set.
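
The default described in the note can be sketched as a small lookup. This is a minimal illustration, not the database's own parser: the function name `state_machine` is hypothetical, and the sketch assumes the statement sits on its own line with the keyword and value separated by whitespace.

```python
def state_machine(mapping_text):
    """Return the value of a #STATEMACHINE line, or "SBC" if none is present."""
    for line in mapping_text.split("\n"):
        parts = line.split()
        # Assumption: keyword and value are separated by whitespace.
        if len(parts) >= 2 and parts[0] == "#STATEMACHINE":
            return parts[1]
    return "SBC"  # documented default when the statement is omitted
```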

    The mapping file consists of multiple lines, each terminated by a linefeed character. Editors that expect a carriage return, or a carriage return followed by a linefeed, as the line terminator may have trouble with this format.

    Note: Linefeed termination is the UNIX convention. Carriage-return linefeed is the Windows convention.
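
One way to check a file edited on Windows before using it is to look for carriage-return characters and convert them. The helpers below are a hedged sketch with hypothetical names (`find_carriage_returns`, `to_linefeed`); they work on raw bytes so that no newline translation happens behind the scenes.

```python
def find_carriage_returns(data: bytes):
    """Return 1-based numbers of linefeed-delimited lines that contain a CR."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\r" in line
    ]


def to_linefeed(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to a single linefeed."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

When using these helpers, read and write the file in binary mode (`open(path, "rb")` and `open(path, "wb")`), because Python's text mode translates line endings by default.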

    Each translation table starts with a #BEGINMAP line and ends with an #ENDMAP line. Apart from these statements and the #STATEMACHINE statement, a # indicates the start of a comment that continues to the end of the line. Blank lines are ignored.

    The #BEGINMAP and #ENDMAP statements also include the name of the map being defined.
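
The line rules above can be sketched as a small splitter that groups the lines of a mapping file by map name. This is an assumption-laden illustration: the function name `split_maps` is hypothetical, the map name is assumed to follow the keyword after whitespace, and the entries inside each map are kept as raw text because their syntax is not described in this section. The sample entries are placeholders, not real mapping syntax.

```python
def split_maps(mapping_text):
    """Group the non-comment, non-blank lines of a mapping file by map name."""
    maps = {}
    current = None  # name of the map currently being read, if any
    for raw in mapping_text.split("\n"):
        words = raw.split()
        if words and words[0] == "#BEGINMAP":
            # Assumption: the map name is the word after the keyword.
            current = words[1] if len(words) > 1 else ""
            maps[current] = []
            continue
        if words and words[0] == "#ENDMAP":
            current = None
            continue
        # Any other '#' starts a comment that runs to the end of the line.
        body = raw.split("#", 1)[0].strip()
        if body and current is not None:
            maps[current].append(body)
    return maps


# Placeholder content; "entry one" and "entry two" are not real entry syntax.
sample = "\n".join([
    "# hypothetical file",
    "#STATEMACHINE SBC",
    "#BEGINMAP demo_map",
    "",
    "entry one   # trailing comment",
    "entry two",
    "#ENDMAP demo_map",
    "",
])
result = split_maps(sample)
```

Here `result` holds one key, `demo_map`, whose value is the two placeholder entries with the comment and blank line removed.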

    For an example, see “Example Mapping File.”