16.10 - STATEMACHINE Statement - Teradata Database

Teradata Database International Character Set Support

prodname: Teradata Database
vrm_release: 16.10
created_date: June 2017
category: Configuration, User Guide
featnum: B035-1125-161K

The mapping file for a multibyte client character set must contain the following statement:

#STATEMACHINE  smachine

where the client character set name determines the value of smachine.

The following entries give, for each client character set name, the value of smachine and the encoding form of the character set for which the mapping file provides translation tables.
Character set name:
  • SDSCHEBCDIC935_6IJ
  • SDTCHEBCDIC937_7IB
  • SDKATAKANAEBCDIC_4IF
  • SDKANJIEBCDIC5026_4IG
  • SDKANJIEBCDIC5035_4IH
  • SDHANGULEBCDIC933_5II
Value of smachine: SOSI0E0F
Encoding form: EBCDIC Shift-Out/Shift-In. The shift-out character 0x0E and the shift-in character 0x0F bracket zero or more double-byte characters.
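
As a rough illustration of the shift-out/shift-in bracketing, the following Python sketch splits a byte string into characters. The function name split_sosi and its error handling are hypothetical and are not part of Teradata Database; the sketch assumes only that bytes between 0x0E and 0x0F are read in pairs and that all other bytes are single-byte characters.

```python
SO = 0x0E  # shift-out: the bytes that follow are double-byte characters
SI = 0x0F  # shift-in: return to single-byte characters


def split_sosi(data: bytes) -> list:
    """Split SO/SI-bracketed data into a list of one- and two-byte characters."""
    chars = []
    double_byte = False
    i = 0
    while i < len(data):
        b = data[i]
        if b == SO:
            if double_byte:
                raise ValueError("shift-out inside a double-byte run")
            double_byte = True
            i += 1
        elif b == SI:
            if not double_byte:
                raise ValueError("shift-in without a preceding shift-out")
            double_byte = False
            i += 1
        elif double_byte:
            if i + 2 > len(data):
                raise ValueError("truncated double-byte character")
            chars.append(data[i:i + 2])
            i += 2
        else:
            chars.append(data[i:i + 1])
            i += 1
    if double_byte:
        raise ValueError("missing shift-in at end of data")
    return chars
```

The shift characters themselves are dropped from the result, so an empty bracket (0x0E immediately followed by 0x0F) contributes no characters.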

Character set name:
  • SDTCHBIG5_3R0
  • SDHANGULKSC5601_4R4
Value of smachine: S81
Encoding form: The value of the first byte in the sequence distinguishes single-byte characters from double-byte characters. If the value of the first byte is:

  • less than 0x81, the length of the character is one byte
  • equal to or greater than 0x81, the length of the character is two bytes

Note: S80 is no longer supported; it behaves like S81. Update your map files to reflect this change (Teradata updates the distributed map files).
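
The first-byte rule for S81 can be sketched in Python as follows. The names s81_char_length and split_s81 are hypothetical helpers for illustration, not Teradata functions; the sketch applies only the 0x81 threshold stated above and does not validate the second byte.

```python
def s81_char_length(first_byte: int) -> int:
    """Return the character length implied by the first byte under S81."""
    return 1 if first_byte < 0x81 else 2


def split_s81(data: bytes) -> list:
    """Split data into characters using the S81 first-byte rule."""
    chars = []
    i = 0
    while i < len(data):
        n = s81_char_length(data[i])
        if i + n > len(data):
            raise ValueError("truncated double-byte character")
        chars.append(data[i:i + n])
        i += n
    return chars
```

Only the first byte of each character is tested, so a second byte below 0x81 is still consumed as part of the double-byte character.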

Character set name: SDSCHGB2312_2T0
Value of smachine: EUC1211
Encoding form: Extended UNIX Code (EUC), composed of two code sets: cs0 for single-byte characters and cs1 for double-byte characters.

Character set name: SDKANJISJIS_1S3
Value of smachine: S80A1E0
Encoding form: The value of the first byte in the sequence distinguishes single-byte characters from double-byte characters. If the value of the first byte is:

  • less than 0x81, the length of the character is one byte
  • equal to or greater than 0x81 and less than 0xA1, the length of the character is two bytes
  • equal to or greater than 0xA1 and less than 0xE0, the length of the character is one byte
  • equal to or greater than 0xE0, the length of the character is two bytes
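
A minimal Python sketch of the S80A1E0 first-byte rule follows. The function name s80a1e0_char_length is hypothetical. The sketch assumes that the two-byte range starting at 0x81 ends just below 0xA1, so that the four ranges do not overlap; it is an illustration of the rule, not the Teradata implementation.

```python
def s80a1e0_char_length(first_byte: int) -> int:
    """Return the character length implied by the first byte under S80A1E0."""
    if first_byte < 0x81:
        return 1  # single-byte range below 0x81
    if first_byte < 0xA1:
        return 2  # assumed upper bound of the first double-byte range
    if first_byte < 0xE0:
        return 1  # single-byte range 0xA1 through 0xDF
    return 2      # double-byte range from 0xE0 upward
```

Checking the ranges in ascending order means each comparison needs only an upper bound.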

Character set name: SDKANJIEUC_1U3
Value of smachine: EUC1223
Encoding form: Extended UNIX Code (EUC), composed of four code sets: cs0 for one-byte characters, cs1 and cs2 for two-byte characters, and cs3 for three-byte characters.
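
The four code sets of EUC1223 can be told apart by the first byte. The Python sketch below is an illustration only: the byte values are assumptions taken from the general EUC convention for Japanese (single-shift bytes 0x8E and 0x8F, cs1 lead bytes 0xA1 through 0xFE) and are not stated in this section, and the function name euc1223_char_length is hypothetical.

```python
SS2 = 0x8E  # assumed single-shift 2: introduces a cs2 character (two bytes in total)
SS3 = 0x8F  # assumed single-shift 3: introduces a cs3 character (three bytes in total)


def euc1223_char_length(first_byte: int) -> int:
    """Return the character length implied by the first byte under EUC1223."""
    if first_byte == SS2:
        return 2  # cs2: shift byte plus one byte
    if first_byte == SS3:
        return 3  # cs3: shift byte plus two bytes
    if 0xA1 <= first_byte <= 0xFE:
        return 2  # cs1: double-byte character
    return 1      # cs0; this sketch treats any other byte as single-byte
```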