Character String Comparisons - Teradata Database

SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Language
English (United States)
Last Update
2018-09-24
dita:id
B035-1145
lifecycle
previous
Product Category
Teradata® Database

Character String Comparisons

Comparison of Character Strings of Unequal Length

If character strings of unequal length are being compared, the shorter of the two is padded on the right with pad characters before the comparison occurs.

Character Strings and Server Character Sets

When comparing character strings, data characters must have the same server character set. If they do not, then the system translates them using the implicit translation rules described in “Implicit Character-to-Character Conversion” on page 601.

Effect of Collation on Character String Comparisons

Collations control character ordering. The results of character comparisons depends on the collation sequence of the character set in use.

You can set the default collation to a sequence that is compatible with the character set for your session. Use the HELP SESSION SQL statement to determine the collation setting for your current session.

The availability of diacritical or Japanese character sets, and your default collation sequence are under the control of your database administrator.

To ensure that sorting and comparison of character data are identical with the same operations performed by the client, users on a Japanese language site should set collation to CHARSET_COLL.

For collation details, see:

  • “SET SESSION COLLATION” in SQL Data Definition Language
  • International Character Set Support
  • “ORDER BY Clause” in SQL Data Manipulation Language
  • Case Sensitivity

    All character data, except for CLOBs, accessed in the execution of a Teradata SQL statement has an attribute of CASESPECIFIC or NOT CASESPECIFIC, either by default or by explicit designation. Character string comparisons use this attribute to determine whether the comparison is case blind or case specific. Case specificity does not apply to CLOBs.

    This is not an ANSI SQL:2011 compatible attribute—ANSI does all character comparisons as the equivalent of CASESPECIFIC.

    The CASESPECIFIC attribute has higher precedence over the NOT CASESPECIFC attribute:

     

    IF …

    THEN the comparison is …

    either argument is CASESPECIFIC

    case specific.

    both arguments are NOT CASESPECIFIC

    case blind.

    The exception is comparisons on GRAPHIC character data, which are always CASESPECIFIC.

    To apply a case specification attribute to a character string, you can:

  • Use the default case specification for the session.
  •  

    IF the session mode is …

    THEN the default case specification is …

    ANSI

    CASESPECIFIC.

    Teradata

    NOT CASESPECIFIC.

    The exception is character data of type GRAPHIC, which is always CASESPECIFIC.

    Default case specification applies to all character data, including literals.

  • Use the CASESPECIFIC or NOT CASESPECIFIC phrase with a character column in a CREATE TABLE or ALTER TABLE statement.
  • For example:

    CREATE TABLE Students
      (StudentID INTEGER
      ,Firstname CHAR(10) CASESPECIFIC
      ,Lastname CHAR(20) NOT CASESPECIFIC);

    Table columns carry the attribute assigned at the time the columns were defined or altered unless a CASESPECIFIC or NOT CASESPECIFIC phrase is used in their access.

  • Apply the CASESPECIFIC or NOT CASESPECIFIC phrase to a character expression in the comparison.
  • For example, the following statement applies the CASESPECIFIC phrase to a character literal:

    SELECT * 
    FROM Students
    WHERE Firstname = 'Ike' (CASESPECIFIC);

    Use this to override the default case specification for character data, or to override the case specification attribute assigned at the time a character column was defined or altered.

    For case blind comparisons, any lowercase single byte Latin letters are converted to uppercase before comparison begins. The prepared strings are compared and any trailing pad characters are ignored.

    A case blind comparison always considers lowercase and uppercase Cyrillic, Greek and full-width ASCII letters to be equivalent. To distinguish lowercase and uppercase Cyrillic, Greek, and fullwidth ASCII letters you must explicitly declare CASESPECIFIC comparison.

    These options work for the KANJISJIS character set as if the data were first converted to the Unicode type and then the options applied.

    Using UPPER for Case Blind Comparisons

    Case blind comparisons can be accomplished using the UPPER function, to make sure a character string value contains no lowercase Latin letters.

    The UPPER function is not the same as declaring a value UPPERCASE.

    For a description of the UPPER function, see “UPPER” on page 1274.

    Example  

    Consider the following query:

       SELECT * 
       FROM STUDENTS 
       WHERE Firstname = 'George';

    The behavior of the comparison Firstname = 'George' under different case specification attributes and session modes is described in the table that follows.

     

    IF column Firstname is …

    THEN …

    CASESPECIFIC

  • If the session mode is ANSI, then ‘George’ is CASESPECIFIC and the match succeeds for rows with Firstname containing ‘George’.
  • If the session mode is Teradata, then “George’ is NOT CASESPECIFIC and the match succeeds for rows with Firstname containing ‘George’.
  • When either character sting is CASESPECIFIC, the comparison is case specific.

    NOT CASESPECIFIC

  • If the session mode is ANSI, then ‘George’ is CASESPECIFIC and the match succeeds for rows with Firstname containing ‘George’. When either character string is CASESPECIFIC, the comparison is case specific.
  • If the session mode is Teradata, then ‘George’ is NOT CASESPECIFIC and the match succeeds for rows with Firstname containing any combination of cases that spell the name George, such as ‘george’ or ‘GEORGE’ or ‘George’. When both character strings are NOT CASESPECIFIC, the comparison is case blind.