17.20 - Comparison Rules - Analytics Database - Teradata Vantage

Teradata Vantage™ - Analytics Database International Character Set Support - 17.20

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2023-01-27

Strings are compared character-by-character.

The comparison rules for CHARSET_COLL are:

  • If one string is shorter, it is padded with the pad character for the character set.
  • If the comparison is not case specific, lowercase characters are mapped to their uppercase counterparts.
  • If the strings are now identical, the equality relation holds. Otherwise, the first pair of characters that are not equal determines the collating sequence.
  • If both characters are in the repertoire of the current client character set, then the binary ordering of the two characters in the client form-of-use becomes the ordering of the two strings.
  • If only one of the characters is outside the repertoire of the current client character set, then the error character is used as the collation point for that character.
  • If both characters being compared are outside the repertoire of the current client character set, then the binary ordering of the characters (case blind or case specific, as appropriate) in the Unicode form-of-use becomes the ordering of the two strings.
  • Kanji data
    • CHARSET_COLL is of limited use with the KANJI1 server character set.

      KANJI1 character data can contain mixed single-byte/multibyte characters. Single-byte characters are translated into the Vantage form-of-use and multibyte characters are not translated.

    • Single-byte characters are collated based on the current character set, and multibyte characters are collated based on their internal value.
    • For KanjiEBCDIC and KanjiShift-JIS client character sets, the collation is like a binary sort on the client.
    • For a KanjiEUC client character set, the collation is like Kanji Phase I ASCII collation.

      The distinction between this and a binary sort on the client is that the JIS X 0208 characters collate before, rather than after, the JIS X 0212 characters.
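
The general CHARSET_COLL rules listed above (padding, case mapping, then ordering by the first unequal pair) can be sketched in Python as follows. This is an illustrative model only, not Vantage's implementation. The function name `charset_coll_compare`, the use of a Python codec to stand in for the client character set, the space pad character, the 0x1A error character, and UTF-16 big-endian as the Unicode form-of-use are all assumptions made for the example. The KANJI1-specific behavior is not modeled.

```python
def _fold(ch: str) -> str:
    """Map a lowercase character to uppercase when the result is one character."""
    up = ch.upper()
    return up if len(up) == 1 else ch


def _client_bytes(ch: str, encoding: str):
    """Return the client form-of-use bytes, or None if outside the repertoire."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None


def charset_coll_compare(a: str, b: str, client_encoding: str = "latin-1",
                         case_specific: bool = False,
                         pad: str = " ", error_char: str = "\x1a") -> int:
    """Return -1, 0, or 1 for a < b, a == b, a > b (hypothetical helper)."""
    # Pad the shorter string with the pad character (assumed: space).
    width = max(len(a), len(b))
    a, b = a.ljust(width, pad), b.ljust(width, pad)

    # For a comparison that is not case specific, map lowercase to uppercase.
    if not case_specific:
        a = "".join(_fold(c) for c in a)
        b = "".join(_fold(c) for c in b)

    # The first pair of unequal characters determines the ordering.
    for x, y in zip(a, b):
        if x == y:
            continue
        bx = _client_bytes(x, client_encoding)
        by = _client_bytes(y, client_encoding)
        if bx is None and by is None:
            # Both outside the client repertoire: binary order in the
            # Unicode form-of-use (assumed here to be UTF-16 big-endian).
            kx, ky = x.encode("utf-16-be"), y.encode("utf-16-be")
        else:
            # Otherwise use client byte order, substituting the error
            # character (assumed 0x1A) for a character outside the repertoire.
            err = error_char.encode(client_encoding)
            kx = bx if bx is not None else err
            ky = by if by is not None else err
        # A tie on the error character is an edge case the rules do not cover.
        return (kx > ky) - (kx < ky)
    return 0  # identical after padding and case mapping
```

For instance, with the assumed Latin-1 client character set, `charset_coll_compare("abc", "ABC")` reports equality, while the same call with `case_specific=True` orders "abc" after "ABC" because 0x61 is greater than 0x41 in the client form-of-use.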