Character String Literals - Teradata Vantage NewSQL Engine - 16.20

Teradata Vantage™ Data Types and Literals

prodname
Teradata Database
Teradata Vantage NewSQL Engine
vrm_release
16.20
category
Programming Reference
featnum
B035-1143-162K

Purpose

Declares a character string literal value in an expression.

Syntax



_character_set
The name of the character set to be associated with the string literal:
  • _Latin indicates that each character in the string is within the repertoire of the LATIN character set.
  • _Unicode indicates that each character in the string is within the repertoire of the UNICODE character set.
  • _KanjiSJIS indicates that each character in the string is within the repertoire of the KANJISJIS character set
  • _Graphic indicates that each character in the string is within the repertoire of the GRAPHIC character set
If _character_set is not specified, the default is _Unicode.
Do not use an identifier of _Kanji1. This will generate an error.
string
The character string literal.

ANSI Compliance

Character literals are ANSI SQL:2011 compliant.

Definition

Character literals consist of zero or more alphanumeric characters enclosed in apostrophes. The new line character and spaces are allowed between the apostrophes. The total length of a character literal is 0 to 31000 bytes.

A character literal can include pass through characters (PTCs) for use in sessions with Unicode Pass Through enabled. Note that character literals containing PTCs cannot be used in account strings, which are validated as object names, and object names cannot contain PTCs. For information about Unicode Pass Through and PTCs, see Teradata Vantage™ NewSQL Engine International Character Set Support, B035-1125.

A zero-length character literal is represented by two consecutive apostrophes ( '' ).

To include an apostrophe ( ' ) in a character literal, use two consecutive apostrophes ( '' ).

Multiple consecutive character literals are treated as if the strings are concatenated.

Character Literal Data Type

The data type of character literals is VARCHAR(n) CHARACTER SET UNICODE, where n is the length of the literal.

For details on the VARCHAR(n) type, see VARCHAR Data Type.

Character Literal Identifiers and KANJI1 Data

The following character data type literal identifiers can be assigned to a KANJI1 character column.
  • _Latin ‘string’
  • _Unicode ‘string’
  • _Graphic ‘string’
  • _KanjiSJIS ‘string’

Conversion is based on the current session character set, which can be established using the BTEQ command SET SESSION CHARSET.

For information on the assignability of these literals, see “Data Type Conversions” in Teradata Vantage™ SQL Functions, Expressions, and Predicates, B035-1145.

KANJI1 support is deprecated. KANJI1 is not allowed as a default character set. The system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible.

Case Specification

The CASESPECIFIC attribute of character strings determines the rules for string comparisons. The default case specificity for a character string literal depends on the mode of the session parsing the string for execution.

In this mode … The default case specificity for strings is …
ANSI CASESPECIFIC
Teradata NOT CASESPECIFIC

The exception is character data of type CHARACTER or VARCHAR CHARACTER SET GRAPHIC, which is always CASESPECIFIC.

Example: Simple Character Strings

The following are examples of valid character literals:

'Los Angeles'
''

Example: Character Strings Containing an apostrophe

To enter an apostrophe in a character literal, double the embedded apostrophe.

'He said ''yes'' to her question'

Example: Character Literals Composed of Segments

Multiple strings are treated as though they are concatenated into one character literal.

('AA' 'BB') is ('AA' || 'BB')
(_Unicode 'AA' _Latin 'BB') is (_Unicode 'AA' || _Latin 'BB')

Example: Character Set Specification

To indicate that the characters in a literal are within the repertoire of the LATIN character set, use the _Latin identifier.

CREATE TABLE table1(Name CHAR(30));
INSERT INTO table1 (_Latin 'Sandoval');

Note that the _Latin identifier does not change the server character set of the character literal data type, which is always UNICODE.

SELECT TYPE (_Latin 'Sandoval');

returns:

Type('Sandoval')
---------------------------------
VARCHAR(8) CHARACTER SET UNICODE;

Example: Converting the Server Character Set of a Character Literal

The data type and server character set of a character literal is always VARCHAR(n) CHARACTER SET UNICODE, where n is the length of the literal.

Certain situations may require you to convert a character literal from UNICODE to a different server character set.

For example, suppose you want to compare the hexadecimal representation of two character strings. To get the hexadecimal representation of a character string, you can use CHAR2HEXINT.

Suppose one of the character strings you want to compare is the character literal 'a'. The following statement returns the hexadecimal representation of 'a':

SELECT CHAR2HEXINT('a');

The result is '0061', the hexadecimal representation of 'a' in the UNICODE server character set, which uses two bytes per character.

Suppose the other character string you want to compare is in a column defined as CHAR(1) CHARACTER SET LATIN:

CREATE TABLE table1
  (column1 CHAR(1) CHARACTER SET LATIN
  ,column2 INTEGER);
INSERT INTO table1 ('a', 1001);

The following statement returns the hexadecimal representation of the string in column1:

SELECT CHAR2HEXINT(column1);

The result is '61', the hexadecimal representation of 'a' in the LATIN server character set, which uses one byte per character.

The following query compares the two strings:

SELECT column2 FROM table1
WHERE CHAR2HEXINT('a') = CHAR2HEXINT(column1);

The result is an empty set, because the result of CHAR2HEXINT('a') is '0061' and the result of CHAR2HEXINT(column1) is '61'.

For situations such as this, you can use TRANSLATE to convert the server character set of one of the character strings to match the server character set of the other character string. For example:

SELECT column2 FROM table1
WHERE CHAR2HEXINT( TRANSLATE('a' USING UNICODE_TO_LATIN) ) =
CHAR2HEXINT(column1);

returns:

    column2
-----------
       1001