Declares a Unicode delimited character literal value in an expression.
- Server Character Set Specification
- An optional specification of the server character set which can be one of the following:
- A string of zero or more characters, enclosed in apostrophes, consisting of a combination of the following:
- Any character other than an apostrophe ( ' ) or the Unicode escape character
- Two consecutive apostrophes
Use two consecutive apostrophes to include an apostrophe character in the Unicode delimited character literal.
- Two consecutive Unicode escape characters
Use two consecutive escape characters to include the escape character in the delimited character literal.
- One Unicode escape character followed by 4 hexadecimal digits, where a hexadecimal digit is a character from 0 to 9, a to f, or A to F
The 4 hexadecimal digits represent a Unicode BMP (Basic Multilingual Plane) character. Leading zeroes are required.
- One Unicode escape character followed by a plus sign (+) followed by 6 hexadecimal digits, where a hexadecimal digit is a character from 0 to 9, a to f, or A to F
The 6 hexadecimal digits represent a Unicode BMP character or a supplementary character. Leading zeroes are required.
- The Unicode escape character is specified by the UESCAPE clause.
- If 'Unicode_string_body' immediately follows the U& key letters, no pad characters can appear before the first apostrophe.
- Use successive occurrences of 'Unicode_string_body' to concatenate multiple strings, separated by a SPACE (U+0020).
- UESCAPE 'Unicode_esc_char'
- A single character from the session character set to use as the Unicode escape character in the delimited character literal.
- The character must be within the printable ASCII range of Unicode characters (U+0021 through U+007E), with the following exceptions:
- SPACE (U+0020)
- QUOTATION MARK (U+0022)
- APOSTROPHE (U+0027)
- PLUS SIGN (U+002B)
- hexadecimal digit (0-9, a-f, or A-F):
- U+0030 to U+0039
- U+0041 to U+0046
- U+0061 to U+0066
- You can also specify the YEN SIGN (U+00A5) and WON SIGN (U+20A9) as the Unicode escape character.
Unicode delimited character literals are partially ANSI SQL:2011 compliant. The ANSI SQL:2011 standard does not require the UESCAPE clause and allows more possibilities for the Unicode_esc_char.
A Unicode delimited character literal can consist of a maximum of 31000 Unicode_string_body characters.
The data type of Unicode delimited character literals is VARCHAR(n) CHARACTER SET UNICODE, where n is the resolved length of the literal in Unicode characters.
For details on the VARCHAR(n) type, see VARCHAR Data Type.
A Unicode delimited character literal is useful for inserting a character string containing characters that cannot generally be entered directly on the terminal keyboard or is not available in the current session character set.
In a session where Unicode Pass Through is enabled, noncharacters in a Unicode delimited character literal are converted to an FFFD replacement character. In sessions where Unicode Pass Through is not enabled, an error is returned. Noncharacters are Unicode code points that are permanently reserved for internal use.
SELECT U&'#FFFE' UESCAPE '#'; SELECT U&'#+01FFFF' UESCAPE '#';
Surrogate pairs and inappropriate surrogates in Unicode delimited character literals result in an error whether Unicode Pass Through is enabled or not.
SELECT U&'#D800#DC00' UESCAPE '#'; SELECT U&'#+00D800#+00DC00' UESCAPE '#'; SELECT U&'#+00D800#+00D801' UESCAPE '#'; SELECT U&'#+00D800' UESCAPE '#';
.set session charset 'ascii' set session character set unicode pass through on; sel u&'#+010000' uescape '#';
.set session charset 'utf8' set session character set unicode pass through off; sel u&'#+010000' uescape '#';
For more information about Unicode Pass Through, see Teradata Vantage™ NewSQL Engine International Character Set Support, B035-1125.
If your application is intended to be ANSI-compliant and portable, you can replace existing hexadecimal character literals of the form 'hexadecimal digits'XC with Unicode character literals.
Examples: Unicode Delimited Character Literal
Consider this table:
CREATE TABLE TextTable (IDNum INTEGER ,Ustring VARCHAR(10) CHARACTER SET UNICODE);
This statement uses a Unicode delimited character literal to insert the character string into the Ustring column, using the number sign (#) as the Unicode escape character:
INSERT TextTable (10, _Unicode U&'#8CC7#6599#5009#5132' UESCAPE '#');
This statement concatenates two strings to insert '855-34-9729' into the Ustring column:
INSERT TextTable (11, U&'855-34-' '9729' UESCAPE '%');
This statement selects all rows from the TextTable table where the string in the Ustring column is an empty string:
SELECT * FROM TextTable WHERE Ustring = U&'' UESCAPE '&';
Examples: Literals Containing Unicode BMP and Supplementary Characters
You can specify Unicode BMP characters using either the 4 or 6 digit syntax. Note that leading zeroes are required.
SELECT U&'#0041' UESCAPE '#' SELECT U&'#+000041' UESCAPE '#'
You can specify Unicode supplementary characters using the 6 digit syntax. Note that leading zeroes are required.
SELECT U&'#+020000' UESCAPE '#' SELECT U&'#+010000' UESCAPE '#'
Example: Error: Literal Containing Surrogate Code Points
SELECT U&'#+010000' UESCAPE '#'
SELECT CHAR2HEXINT(U&'#D800#DC00' UESCAPE '#');