The mapping file for a multibyte client character set contains single-byte and multibyte translation tables that define the mapping between an internal transition form and Unicode for the character set.
The single-byte translation tables are analogous to the E2I and I2E fields in the translation tables in DBC.CharTranslationV, which define the conversion between the client character set and the internal transition form.
The mapping file for a multibyte client character set must contain the following statement:
#STATEMACHINE smachine
where the client character set name determines the value of smachine.
| IF the character set name is… | THEN the value of smachine is… | AND the mapping file provides translation tables for a character set that uses this encoding form… |
| | SOSI0E0F | EBCDIC Shift-Out/Shift-In. Shift-out character 0x0E and shift-in character 0x0F bracket zero or more double-byte characters. |
| | S81 | The value of the first byte in the sequence, which distinguishes single-byte characters from double-byte characters. If the value of the first byte is: Note: S80 is no longer supported; S80 behaves like S81. Change your map files to reflect this (Teradata updates the distributed map files). |
| SDSCHGB2312_2T0 | EUC1211 | Extended UNIX Code (EUC), composed of two code sets: cs0 for single-byte characters and cs1 for double-byte characters. |
| SDKANJISJIS_1S3 | S80A1E0 | The value of the first byte in the sequence, which distinguishes single-byte characters from double-byte characters. If the value of the first byte is: |
| SDKANJIEUC_1U3 | EUC1223 | Extended UNIX Code (EUC), composed of four code sets: cs0 for one-byte characters, cs1 and cs2 for two-byte characters, and cs3 for three-byte characters. |
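As an illustration of the Shift-Out/Shift-In encoding form listed for SOSI0E0F, the following Python sketch splits a byte string into single-byte and double-byte units. The function name split_sosi and the error handling are assumptions made for this example; the sketch does not describe how the database itself processes such data.

```python
from __future__ import annotations

SO = 0x0E  # shift-out: the bytes that follow are double-byte characters
SI = 0x0F  # shift-in: return to single-byte characters


def split_sosi(data: bytes) -> list[tuple[str, bytes]]:
    """Split SO/SI-bracketed data into ("SBC", 1 byte) and ("DBC", 2 bytes) units."""
    units = []
    double_byte = False
    i = 0
    while i < len(data):
        b = data[i]
        if b == SO and not double_byte:
            double_byte = True
            i += 1
        elif b == SI and double_byte:
            double_byte = False
            i += 1
        elif double_byte:
            # Inside the bracket, characters are consumed two bytes at a time.
            if i + 2 > len(data):
                raise ValueError("truncated double-byte character")
            units.append(("DBC", data[i:i + 2]))
            i += 2
        else:
            units.append(("SBC", data[i:i + 1]))
            i += 1
    if double_byte:
        raise ValueError("shift-out without matching shift-in")
    return units
```

An SO/SI pair with nothing between it yields no units, which matches the "zero or more double-byte characters" wording above.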
Translation tables map characters between an extended site-defined client character set and Unicode.
At a minimum, a mapping file must provide translation tables that map characters from the client character set to Unicode.
A mapping file may optionally provide translation tables that map characters from Unicode to the client character set. If the optional translation tables are not defined, the system derives them by inverting the corresponding mandatory tables.
If one of the following conditions exists, however, the optional translation tables become mandatory:
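Setting those conditions aside, the default inversion of a mandatory table can be pictured with a short Python sketch. The function name invert_table, the dictionary representation, and the decision to raise an error when two client codes share one Unicode code point are all assumptions made for this example; the actual behavior of the system in that case is not described here.

```python
from __future__ import annotations


def invert_table(to_unicode: dict[int, int]) -> dict[int, int]:
    """Build a Unicode-to-client table by swapping keys and values."""
    from_unicode: dict[int, int] = {}
    for client_code, code_point in to_unicode.items():
        if code_point in from_unicode:
            # Assumption: treat a shared code point as an error, because a
            # plain swap cannot decide which client code should win.
            raise ValueError(f"U+{code_point:04X} has more than one client code")
        from_unicode[code_point] = client_code
    return from_unicode
```

With hypothetical entries such as `{0x41: 0x0041, 0x8140: 0x3000}`, the inverted table maps each code point back to the client code it came from.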
Each translation table starts with the following statement:
#BEGINMAP table_name
and ends with the following statement:
#ENDMAP table_name
where:
The value of table_name is determined by the client character set name.
| IF the character set name is… | THEN the mapping file must define these tables… | AND optionally define these tables… |
| SDHANGULEBCDIC933_5II | | |
| SDHANGULKSC5601_4R4 | | |
| SDKANJIEBCDIC5026_4IG | | |
| SDKANJIEBCDIC5035_4IH | | |
| SDKANJIEUC_1U3 | | |
| SDKANJISJIS_1S3 | | |
| SDKATAKANAEBCDIC_4IF | | |
| SDSCHEBCDIC935_6IJ | | |
| SDSCHGB2312_2T0 | | |
| SDTCHBIG5_3R0 | | |
| SDTCHEBCDIC937_7IB | | |
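The #BEGINMAP and #ENDMAP framing can be checked with a small reader such as the Python sketch below. It is only an illustration: the function name read_tables is hypothetical, table entries are kept as raw text because their format is not described in this section, and the sketch assumes that any other line starting with # (a comment or another statement such as #STATEMACHINE) can be skipped when collecting tables.

```python
from __future__ import annotations


def read_tables(text: str) -> dict[str, list[str]]:
    """Collect the raw lines found between each #BEGINMAP/#ENDMAP pair."""
    tables: dict[str, list[str]] = {}
    current = None  # name of the table being read, if any
    for line in text.split("\n"):
        words = line.split()
        keyword = words[0] if words else ""
        if keyword == "#BEGINMAP":
            if current is not None or len(words) < 2:
                raise ValueError(f"unexpected statement: {line!r}")
            current = words[1]
            tables[current] = []
        elif keyword == "#ENDMAP":
            # The name after #ENDMAP must match the open table.
            if len(words) < 2 or words[1] != current:
                raise ValueError(f"unmatched statement: {line!r}")
            current = None
        elif current is not None and words and not keyword.startswith("#"):
            # Non-blank, non-# line inside a table: keep it verbatim.
            tables[current].append(line)
    if current is not None:
        raise ValueError(f"missing #ENDMAP for {current}")
    return tables
```

The returned dictionary is keyed by table name, so a caller can compare its keys against the tables required for a given character set.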
A mapping file named map_4R has the following statement:
#STATEMACHINE S81
and defines the following translation tables:
4R_SBC_2_UNICODE
4R_MBC_2_UNICODE
and optionally defines the following translation tables:
UNICODE_2_4R_SBC
UNICODE_2_4R_MBC
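To show how these statements fit together, the Python sketch below generates the statement skeleton of the map_4R example without any table entries, since the entry format is not covered here. The function name map_4r_skeleton is hypothetical, and the ordering (the #STATEMACHINE statement first, then the tables in the order listed) is an assumption rather than a documented requirement.

```python
def map_4r_skeleton(include_optional: bool = True) -> str:
    """Return the statement skeleton of the map_4R example, without table entries."""
    tables = ["4R_SBC_2_UNICODE", "4R_MBC_2_UNICODE"]
    if include_optional:
        tables += ["UNICODE_2_4R_SBC", "UNICODE_2_4R_MBC"]
    lines = ["#STATEMACHINE S81"]  # assumption: placed before the tables
    for name in tables:
        lines.append(f"#BEGINMAP {name}")
        lines.append(f"#ENDMAP {name}")
    # Every line, including the last, is terminated by a linefeed.
    return "".join(line + "\n" for line in lines)
```

Calling `map_4r_skeleton(include_optional=False)` produces only the two mandatory tables.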
A mapping file consists of multiple lines, each terminated by a linefeed character. This may be problematic for editors that expect a carriage return, or a carriage return followed by a linefeed, to terminate a line.
Note: Linefeed termination is the UNIX convention. Carriage-return linefeed is the Windows convention.
Use the # character to start a comment that continues to the end of the line. Blank lines are ignored.
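If a mapping file has been edited with a tool that writes carriage returns, the line endings can be converted back to linefeeds before the file is used. The Python sketch below is one way to do that; the function name to_linefeeds is hypothetical.

```python
def to_linefeeds(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR line endings as LF."""
    # Replace CRLF first so that each pair becomes a single linefeed.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Content that already uses linefeeds passes through unchanged, so the conversion is safe to apply more than once.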