Teradata Database allows a character set to be established when invoking FastExport. For example, if table or database names contain Kanji double‑byte characters or a mix of single‑byte and multibyte characters, the appropriate character set can be chosen.
Table 9 lists the standard character sets supported by FastExport.
Name | Description | System Configuration |
EBCDIC | Latin | Mainframe‑attached |
ASCII | Latin | Network‑attached |
HANGULEBCDIC933_1II | Korean | Mainframe‑attached |
HANGULKSC5601_2R4 | Korean | Network‑attached |
KATAKANAEBCDIC | Japanese | Mainframe‑attached |
KANJIEBCDIC5026_0I | Japanese | Mainframe‑attached |
KANJIEBCDIC5035_0I | Japanese | Mainframe‑attached |
KANJIEUC_0U | Japanese | Network‑attached |
KANJISJIS_0S | Japanese | Network‑attached |
SCHEBCDIC935_2IJ | Simplified Chinese | Mainframe‑attached |
SCHGB2312_1T0 | Simplified Chinese | Network‑attached |
TCHBIG5_1R0 | Traditional Chinese | Network‑attached |
TCHEBCDIC937_3IB | Traditional Chinese | Mainframe‑attached |
UTF8 | Unicode® character set | Network‑attached |
UTF‑8 | Unicode character set | Mainframe‑attached, Network‑attached |
UTF16 | Unicode character set | Network‑attached |
UTF‑16 | Unicode character set | Network‑attached |
Site‑Defined Character Sets
When the standard character sets are not appropriate for a site, define one of the site‑defined character sets shown in Table 10.
Name | Description | System Configuration |
SDKATAKANAEBCDIC_4IF | Site‑defined Japanese | Mainframe‑attached |
SDKANJIEBCDIC5026_4IG | Site‑defined Japanese | Mainframe‑attached |
SDKANJIEBCDIC5035_4IH | Site‑defined Japanese | Mainframe‑attached |
SDKANJIEUC_1U3 | Site‑defined Japanese | Network‑attached |
SDKANJISJIS_1S3 | Site‑defined Japanese | Network‑attached |
SDSCHEBCDIC935_6IJ | Site‑defined Simplified Chinese | Mainframe‑attached |
SDTCHEBCDIC937_7IB | Site‑defined Traditional Chinese | Mainframe‑attached |
SDSCHGB2312_2T0 | Site‑defined Simplified Chinese | Network‑attached |
SDTCHBIG5_3R0 | Site‑defined Traditional Chinese | Network‑attached |
SDHANGULEBCDIC933_5II | Site‑defined Korean | Mainframe‑attached |
SDHANGULKSC5601_4R4 | Site‑defined Korean | Network‑attached |
Note: For information about defining a character set appropriate for a site, see International Character Set Support (B035‑1132).
Rules for Using Chinese and Korean Character Sets
Observe the following rules when using Chinese and Korean character sets on mainframe‑attached and network‑attached platforms:
Object names are limited to A‑Z, a‑z, 0‑9, and special characters such as $ and _.
Teradata Database requires two bytes to process each of the Chinese or Korean characters. This limits both request size and record size. For example, if a record consists of one string, the length of that string is limited to a maximum of 32,000 characters or 64,000 bytes.
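The arithmetic behind this limit can be sketched in C. The helper name and the macro are illustrative only and are not part of FastExport; the two‑bytes‑per‑character figure is the one stated in the rule above.

```c
#include <stddef.h>

/* Bytes Teradata Database needs per Chinese or Korean character,
   as stated in the rule above. */
#define BYTES_PER_DBCS_CHAR 2u

/* Hypothetical helper: number of two-byte characters that fit in a
   record (or single string) of max_record_bytes bytes. */
size_t max_dbcs_chars(size_t max_record_bytes)
{
    return max_record_bytes / BYTES_PER_DBCS_CHAR;
}
```

Calling max_dbcs_chars(64000) yields 32000, which matches the single‑string example given in the rule.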
Note: For more information about Chinese or Korean character set restrictions for Teradata Database, or for more information about alternate character sets, see International Character Set Support (B035‑1132).
If Japanese language support is not required, specify EBCDIC or ASCII as the character set parameter.
Unicode® Character Sets
UTF‑8 and UTF‑16 are two of the standard ways of encoding Unicode character data. The UTF‑8 client character set supports UTF‑8 encoding; currently, Teradata Database supports UTF‑8 characters of one to three bytes. The UTF‑16 client character set supports UTF‑16 encoding; currently, Teradata Database supports the Unicode 2.1 standard, where each defined character requires exactly 16 bits.
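The one‑to‑three‑byte range follows from the standard UTF‑8 encoding rules for code points that fit in 16 bits. The following C sketch shows the mapping; the function name is hypothetical and the code illustrates the encoding itself, not any FastExport interface.

```c
#include <stdint.h>

/* Returns the number of bytes UTF-8 uses to encode a 16-bit code point
   (U+0000..U+FFFF), or 0 for a UTF-16 surrogate value (U+D800..U+DFFF),
   which is not a character on its own. */
int utf8_len_bmp(uint16_t cp)
{
    if (cp < 0x0080) return 1;                   /* 7-bit ASCII range   */
    if (cp < 0x0800) return 2;                   /* e.g. Latin accents  */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate, invalid  */
    return 3;                                    /* e.g. CJK ideographs */
}
```

A string of n such characters therefore occupies between n and 3n bytes in UTF‑8, but exactly 2n bytes in UTF‑16.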
There are restrictions imposed by Teradata Database on using the UTF‑8 or UTF‑16 character set. For restriction details, see International Character Set Support (B035‑1132).
FastExport supports the UTF‑8 character set on network‑attached platforms and IBM z/OS. When the UTF‑8 client character set is used on IBM z/OS, the job script must be in Teradata EBCDIC. FastExport translates commands in the job script from Teradata EBCDIC to UTF‑8 during the export.
Be sure to check the definition in International Character Set Support (B035‑1132) to determine the code points of any special characters required in the job script.
Different versions of EBCDIC do not always agree as to the placement of these characters. See International Character Set Support (B035‑1132) for details on mapping Teradata EBCDIC and Unicode.
FastExport supports the UTF‑16 character set on network‑attached platforms. In general, the command language and the job output should use the same character set as the client character set used by the job. However, for convenience and because of the special properties of Unicode, this is not required when the UTF‑16 character set is used: the job script and the job output can each be in either the UTF‑8 or the UTF‑16 character set. To select the encoding, specify the runtime parameters “‑i” and “‑u” when the job is invoked.
For more information on runtime parameters “‑i” and “‑u”, see Table 6 on page 17.
Table 11 describes four ways to either specify the character set or accept a default specification.
Method | Description |
Client System Specification | Another way is to specify the character set for a client system before invoking FastExport by configuring the: Note: The character‑set‑name specification used when invoking FastExport always takes precedence over the current client system specification. |
FastExport Utility Default | If there is no character set specification in DBC.Hosts, then FastExport defaults to: |
Runtime Parameter Specification | The best way to specify the character set is with the character set runtime parameter when invoking FastExport, as described earlier in this chapter: For a list of valid character set names, see “Character Set Specification” on page 31. |
Teradata Database Default | If a character‑set‑name specification is not used when FastExport is invoked, and there is no character set specification for the client system, then the utility uses the default specification in the Teradata Database system table DBC.Hosts. Note: If the DBC.Hosts table specification is relied upon for the default character set, make sure that the initial logon is in the default character set: |
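The order in which these four sources are consulted can be summarized with a short C sketch. This only illustrates the precedence described in Table 11; it is not FastExport code. The function names are hypothetical, and the utility default is passed in by the caller because its value is not stated here.

```c
#include <stddef.h>

/* Returns s if it points to a non-empty string, otherwise NULL. */
static const char *non_empty(const char *s)
{
    return (s != NULL && s[0] != '\0') ? s : NULL;
}

/* Hypothetical resolver: picks the character set name in the order
   described by Table 11. Pass NULL or "" for a source that is not set.
   1. runtime parameter given when FastExport is invoked
   2. client system specification
   3. default specification in DBC.Hosts
   4. FastExport utility default (caller-supplied assumption) */
const char *resolve_charset(const char *runtime_param,
                            const char *client_system_spec,
                            const char *dbc_hosts_spec,
                            const char *utility_default)
{
    const char *s;

    if ((s = non_empty(runtime_param)) != NULL) return s;
    if ((s = non_empty(client_system_spec)) != NULL) return s;
    if ((s = non_empty(dbc_hosts_spec)) != NULL) return s;
    return utility_default;
}
```

For example, when both a runtime parameter and a client system specification are present, the runtime parameter is returned, matching the note in the Client System Specification row.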
Using AXSMOD
When an AXSMOD is used, FastExport passes the session character set as an attribute to the AXSMOD for its possible use (most AXSMODs make no use of this information). The attribute name is CHARSET_NAME, and its value is a variable‑length character string.
After FastExport passes the session character set to the AXSMOD successfully, it passes the export widths information that pertains to the current session character set as a second attribute for the AXSMOD's possible use. The attribute name is EXPORT_WIDTHS. FastExport extracts the export widths information from the data parcel returned by the HELP SESSION command.
The export widths information is passed as an array to the AXSMOD, which uses it to calculate the size in bytes of exported fixed‑length character columns. This size depends not only on the number of characters in the data type (the n in CHAR(n)), but also on the selected session character set and the server character type (specified in the CHARACTER SET clause of the CREATE TABLE statement). Each structure passed in the array has information for one server character type. The export widths information structure is defined as follows:
typedef struct pmExpWidth
{
    pmUInt16 CharType;    /* Server character type code. */
    pmUInt16 ExpWidth;    /* Export width. */
    pmUInt16 ExpWidthAdj; /* Export width adjustment. */
} pmExpWidth_t;
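As a rough illustration of how an AXSMOD might use this array, the following C sketch looks up the entry for a server character type and computes the exported size of a CHAR(n) column. Several things here are assumptions rather than documented behavior: pmUInt16 is taken to be an unsigned 16‑bit integer, the helper name is hypothetical, and the formula n * ExpWidth + ExpWidthAdj is an assumed reading of the field names. Confirm the actual rules in Utilities (B035‑1102) before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: pmUInt16 is an unsigned 16-bit integer. The real typedef
   comes from the AXSMOD header files, which are not shown here. */
typedef uint16_t pmUInt16;

typedef struct pmExpWidth
{
    pmUInt16 CharType;    /* Server character type code. */
    pmUInt16 ExpWidth;    /* Export width. */
    pmUInt16 ExpWidthAdj; /* Export width adjustment. */
} pmExpWidth_t;

/* Hypothetical helper: exported size in bytes of a CHAR(n) column.
   Assumption: size = n * ExpWidth + ExpWidthAdj.
   Returns 0 if the array has no entry for char_type. */
size_t exported_char_bytes(const pmExpWidth_t *widths, size_t count,
                           pmUInt16 char_type, size_t n)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (widths[i].CharType == char_type) {
            return n * widths[i].ExpWidth + widths[i].ExpWidthAdj;
        }
    }
    return 0;
}
```

The linear search reflects the statement that each structure in the array describes one server character type, so the array is expected to be short.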
For more information about export width rules, see Utilities (B035‑1102).
Multibyte Character Sets
Teradata Database supports multibyte characters in object names when the client session character set is UTF‑8 or UTF‑16. Refer to International Character Set Support (B035‑1132) for a list of valid characters used in object names. If multibyte characters are used in object names in a Teradata FastExport script, the names must be enclosed in double quotes.
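A tool that generates FastExport scripts could apply the quoting rule with a helper such as the following C sketch. It assumes the script is encoded in UTF‑8 (or another ASCII‑compatible multibyte encoding), so that any byte of 0x80 or above signals a multibyte character, and it assumes the name contains no embedded double quote. Both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if name contains any byte outside 7-bit ASCII, which in a
   UTF-8 script indicates a multibyte character. */
int has_multibyte(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;

    for (; *p != '\0'; p++) {
        if (*p >= 0x80) return 1;
    }
    return 0;
}

/* Hypothetical helper: writes name into out, wrapped in double quotes
   when it contains multibyte characters. Returns 0 on success, or -1 if
   out is too small to hold the result.
   Assumption: name contains no embedded double quote. */
int quote_object_name(const char *name, char *out, size_t out_size)
{
    int n = has_multibyte(name)
          ? snprintf(out, out_size, "\"%s\"", name)
          : snprintf(out, out_size, "%s", name);

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Names made up only of single‑byte characters are passed through unchanged. The byte test does not apply to a script encoded in UTF‑16.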
To log on with the UTF‑8 character set or another supported multibyte character set (Chinese, Japanese, or Korean), create object names shorter than 30 bytes. This limitation applies to the user ID, password, and account. The logon might fail if any of these object names exceeds 30 bytes.
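Because the limit is expressed in bytes rather than characters, a name that looks short can still be too long. The following C sketch checks the three logon elements named above. The function names are hypothetical, and because the text is ambiguous about a name of exactly 30 bytes, the sketch takes the stricter reading and accepts only names shorter than 30 bytes.

```c
#include <string.h>

/* Limit taken from the text above. A name of exactly 30 bytes is
   rejected here; this is the stricter of the two possible readings. */
#define LOGON_NAME_LIMIT_BYTES 30u

/* Returns 1 if the NUL-terminated name is shorter than the limit.
   strlen counts bytes, not characters, which is what matters here. */
int logon_name_fits(const char *name)
{
    return strlen(name) < LOGON_NAME_LIMIT_BYTES;
}

/* Hypothetical check covering the three logon elements named above. */
int logon_fits(const char *userid, const char *password, const char *account)
{
    return logon_name_fits(userid)
        && logon_name_fits(password)
        && logon_name_fits(account);
}
```

For instance, a user ID of ten characters that each take three bytes in UTF‑8 occupies 30 bytes and is rejected by this check, even though it is only ten characters long.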
Multibyte character sets impact the operation of certain FastExport commands, as well as object names in Teradata SQL statements, as shown in Table 12.
FastExport Command | Affected Elements | Impact |
FIELD | Field name | The field name specified can have multibyte characters. In addition, it can be referenced in: |
FILLER | Filler name | The name specified in a FILLER command can have multibyte characters. |
LAYOUT | Layout name | The layout name can: |
LAYOUT | CONTINUEIF condition | The CONTINUEIF condition can specify multibyte character set character comparisons. |
LOGON | User name and password | The user name and password can have multibyte characters. |
LOGTABLE | Table and database names | The restart log table name and database name can have multibyte characters. |