Unwanted Export Truncation - Teradata Database

International Character Set Support

Product: Teradata Database
Release Number: 15.10
Language: English (United States)
Last Update: 2018-09-25
dita:id: B035-1125
lifecycle: previous
Product Category: Teradata® Database

Although the defaults are set to avoid it in most cases, the EXPORT facility can still produce unwanted data truncation.

Under the expected defaults, truncation can occur only in the following circumstances:

  • During a KanjiEUC session where cs3 characters exist in a string without a corresponding number of cs0 characters.
  • During a KanjiEBCDIC session where a string exists with highly interspersed single-byte and multibyte characters.
The examples that follow illustrate how unwanted truncation can occur during the export process.

    Note that you can avoid these types of truncation by using the CAST function. For information, see “CAST Function” in SQL Functions, Operators, Expressions, and Predicates.

    For example, consider the following SELECT statement from a KanjiEUC session.

       SELECT 'Âgé';

    The string Âgé is translated internally to UCS2.

     

    Bytes      00 C2   00 67   00 E9
    Character  Â       g       é

    Because the string has three characters, the expected defaults export at most six bytes, but the representation of Âgé in KanjiEUC requires seven bytes.

     

    Bytes      8F AA A4   67   8F AB B1
    Character  Â          g    é

    Therefore, the output must be truncated to fit within six bytes.

    Truncation occurs at a character boundary, so the output is truncated to Âg, which requires four bytes.

    If this were a fixed-width field, these four bytes would be padded with SPACE characters to produce six bytes. Because the field is not fixed width, only four bytes are output (excluding the length information, which is always included in variable-width data).

     

    Bytes      8F AA A4   67
    Character  Â          g
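The KanjiEUC case above can be modeled with a short Python sketch. It is an illustration only, not Teradata code: the function name `export_kanjieuc` is hypothetical, the byte sequences are copied from the tables above, and the budget of two bytes per character is an assumption inferred from the example (three characters, six bytes).

```python
# Per-character KanjiEUC byte sequences, copied from the tables above.
EUC_BYTES = {
    "Â": bytes([0x8F, 0xAA, 0xA4]),  # cs3 character: three bytes
    "g": bytes([0x67]),              # cs0 character: one byte
    "é": bytes([0x8F, 0xAB, 0xB1]),  # cs3 character: three bytes
}


def export_kanjieuc(text, fixed_width=False):
    """Model the export: keep only whole characters that fit the byte budget."""
    budget = 2 * len(text)  # assumption: two bytes per character
    out = b""
    for ch in text:
        encoded = EUC_BYTES[ch]
        if len(out) + len(encoded) > budget:
            break  # truncate at a character boundary
        out += encoded
    if fixed_width:
        out += b"\x20" * (budget - len(out))  # pad with SPACE characters
    return out


print(export_kanjieuc("Âgé").hex())                    # 8faaa467
print(export_kanjieuc("Âgé", fixed_width=True).hex())  # 8faaa4672020
```

The variable-width call returns the four bytes for Âg, and the fixed-width call pads the same four bytes to six. The length information that accompanies variable-width data is not modeled.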

    For KanjiEBCDIC, the problem occurs when a single-byte character is placed between two sequences of multibyte characters, because each multibyte sequence is bracketed by its own shift-out (0E) and shift-in (0F) bytes.

    For example, consider the following SELECT statement.

       SELECT '平成1年';

    The string 平成1年, where the 1 is a single-byte character, is translated internally to UCS2.

     

    Bytes      5E 73   62 10   00 31   5E 74
    Character  平      成      1       年

    Because the string has four characters, the expected defaults export at most ten bytes, but the representation of 平成1年 in KanjiEBCDIC requires eleven bytes.

     

    Bytes      0E   45 8D   45 BA   0F   31   0E   45 60   0F
    Character  <    平      成      >    1    <    年      >

    Therefore, the output must be truncated to fit within ten bytes.

    Truncation occurs at a character boundary, so the output is truncated to 平成1, which requires seven bytes.

    If this were a fixed-width field, these seven bytes would be padded with SPACE characters to produce ten bytes. Because the field is not fixed width, only seven bytes are output (excluding the length information, which is always included in variable-width data).

     

    Bytes      0E   45 8D   45 BA   0F   31
    Character  <    平      成      >    1
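The KanjiEBCDIC case can be sketched the same way. This is a rough model rather than Teradata's implementation: the names `encode` and `export_kanjiebcdic` are hypothetical, the byte values are copied from the tables above, the characters are labeled by their UCS2 code points, and the budget of two bytes per character plus two is an assumption inferred from the example (four characters, ten bytes).

```python
SHIFT_OUT, SHIFT_IN = b"\x0e", b"\x0f"

# (UCS2 code point label, KanjiEBCDIC bytes, is_multibyte), from the tables above.
SAMPLE = [
    ("U+5E73", bytes([0x45, 0x8D]), True),
    ("U+6210", bytes([0x45, 0xBA]), True),
    ("U+0031", bytes([0x31]), False),  # the single-byte character 1
    ("U+5E74", bytes([0x45, 0x60]), True),
]


def encode(chars):
    """Encode characters, wrapping each multibyte run in shift-out/shift-in."""
    out, in_multibyte = b"", False
    for _label, raw, is_multibyte in chars:
        if is_multibyte and not in_multibyte:
            out += SHIFT_OUT  # entering a multibyte run
        elif not is_multibyte and in_multibyte:
            out += SHIFT_IN   # leaving a multibyte run
        out += raw
        in_multibyte = is_multibyte
    if in_multibyte:
        out += SHIFT_IN       # close a run that reaches the end of the string
    return out


def export_kanjiebcdic(chars):
    """Return the longest whole-character prefix whose encoding fits the budget."""
    budget = 2 * len(chars) + 2  # assumption: 2n + 2 bytes for n characters
    for keep in range(len(chars), -1, -1):
        encoded = encode(chars[:keep])
        if len(encoded) <= budget:
            return encoded  # keep == 0 always fits, so the loop always returns


print(len(encode(SAMPLE)))               # 11
print(export_kanjiebcdic(SAMPLE).hex())  # 0e458d45ba0f31
```

The single-byte character splits the multibyte text into two runs, so the full string pays for two shift-out/shift-in pairs (four bytes) while the budget allows for only one pair. Dropping the last character removes the second pair as well, which is why the output falls from eleven bytes to seven rather than to ten.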