Comparison Rules - Advanced SQL Engine - Teradata Database

International Character Set Support

Product: Advanced SQL Engine, Teradata Database
Release Number: 17.05, 17.00
Published: June 2020
Language: English (United States)
Last Update: 2021-01-23
dita:id: B035-1125
lifecycle: previous
Product Category: Teradata Vantage™

Strings are compared character by character.

The comparison rules for CHARSET_COLL are:

  • If one string is shorter, it is padded with the pad character for the character set.
  • If the comparison is not case specific, lowercase characters are mapped to their uppercase counterparts.
  • If the strings are now identical, the equality relation holds. Otherwise, the first pair of characters that are not equal determines the collating sequence.
  • If both characters are in the repertoire of the current client character set, then the binary ordering of the two characters in the client form-of-use becomes the ordering of the two strings.
  • If only one of the characters is within the repertoire of the current client character set, then the error character is used as the collation point for the character that is outside the repertoire.
  • If both characters being compared are outside the repertoire of the current client character set, then the binary ordering of the characters (case blind or case specific, as appropriate) in the Unicode form-of-use becomes the ordering of the two strings.
  • Kanji data
    • CHARSET_COLL is of limited use with the KANJI1 server character set.

      KANJI1 character data can contain a mix of single-byte and multibyte characters. Single-byte characters are translated into the Teradata Database form-of-use, but multibyte characters are not translated.

    • Single-byte characters are collated according to the current character set, and multibyte characters according to their internal value.
    • For KanjiEBCDIC and KanjiShift-JIS client character sets, the collation is like a binary sort on the client.
    • For a KanjiEUC client character set, the collation is like Kanji Phase I ASCII collation.

      This differs from a binary sort on the client in that the JIS X 0208 characters collate before, rather than after, the JIS X 0212 characters.
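
The general CHARSET_COLL rules at the start of this section (padding, optional case folding, then ordering by the first unequal pair) can be sketched in Python as follows. This is an illustrative model only, not Teradata code. The function names, the use of a Python codec to stand in for the client character set, the space pad character, the 0x1A error character, and UTF-16 big-endian as the Unicode form-of-use are all assumptions made for the example. It does not model the KANJI1 behavior described above.

```python
def client_bytes(ch, client_charset):
    """Return ch in the (assumed) client form-of-use, or None if it is
    outside the repertoire of the client character set."""
    try:
        return ch.encode(client_charset)
    except UnicodeEncodeError:
        return None


def fold_case(ch):
    """Map a lowercase character to uppercase, keeping it one character long."""
    upper = ch.upper()
    return upper if len(upper) == 1 else ch


def charset_coll_compare(a, b, client_charset="latin-1", case_specific=True,
                         pad=" ", error_char=b"\x1a"):
    """Hypothetical model of the rules: returns -1, 0, or 1.

    pad and error_char are illustrative assumptions, not documented values.
    """
    # Rule 1: pad the shorter string with the pad character.
    width = max(len(a), len(b))
    a, b = a.ljust(width, pad), b.ljust(width, pad)

    # Rule 2: for a comparison that is not case specific, fold to uppercase.
    if not case_specific:
        a = "".join(fold_case(c) for c in a)
        b = "".join(fold_case(c) for c in b)

    # Rule 3: the first unequal pair of characters decides the order.
    for ca, cb in zip(a, b):
        if ca == cb:
            continue
        ea = client_bytes(ca, client_charset)
        eb = client_bytes(cb, client_charset)
        if ea is None and eb is None:
            # Both outside the client repertoire: binary order in the
            # Unicode form-of-use (assumed here to be UTF-16 big-endian).
            ka, kb = ca.encode("utf-16-be"), cb.encode("utf-16-be")
        else:
            # In the repertoire: binary order in the client form-of-use.
            # Outside it: the error character is the collation point.
            ka = ea if ea is not None else error_char
            kb = eb if eb is not None else error_char
        if ka != kb:
            return -1 if ka < kb else 1
        # Equal collation points: this sketch moves on to the next pair.
    return 0
```

For example, under these assumptions charset_coll_compare("abc", "abc   ") treats the strings as equal because of padding, and charset_coll_compare("abc", "ABC", case_specific=False) treats them as equal because of case folding.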