Declares a Unicode delimited character literal value in an expression.
The data type of Unicode delimited character literals is VARCHAR(n) CHARACTER SET UNICODE, where n is the resolved length of the literal in Unicode characters.
For details on the VARCHAR(n) type, see VARCHAR Data Type.
ANSI Compliance
Unicode delimited character literals are partially ANSI SQL:2011 compliant. The ANSI SQL:2011 standard does not require the UESCAPE clause and allows more possibilities for the Unicode_esc_char.
Syntax
[ _Latin | _Unicode | _KanjiSJIS | _Graphic ] U& 'Unicode_string_body' [ 'Unicode_string_body' [...] ] UESCAPE 'Unicode_esc_char'
Syntax Elements
- 'Unicode_string_body'
- A string of zero or more characters, enclosed in apostrophes, consisting of a combination of the following:
- Any character other than an apostrophe ( ' ) or the Unicode escape character
- Two consecutive apostrophes
Use two consecutive apostrophes to include an apostrophe character in the Unicode delimited character literal.
- Two consecutive Unicode escape characters
Use two consecutive escape characters to include the escape character in the delimited character literal.
- One Unicode escape character followed by 4 hexadecimal digits, where a hexadecimal digit is a character from 0 to 9, a to f, or A to F
The 4 hexadecimal digits represent a Unicode BMP (Basic Multilingual Plane) character. Leading zeroes are required.
- One Unicode escape character followed by a plus sign (+) followed by 6 hexadecimal digits, where a hexadecimal digit is a character from 0 to 9, a to f, or A to F
The 6 hexadecimal digits represent a Unicode BMP character or a supplementary character. Leading zeroes are required.
- Unicode_esc_char
- A single character from the session character set to use as the Unicode escape character in the delimited character literal.
Usage Notes
A Unicode delimited character literal is useful for inserting a character string containing characters that cannot be entered directly on the terminal keyboard or is not available in the current session character set.
In a session where Unicode Pass Through is enabled, noncharacters in a Unicode delimited character literal are converted to an FFFD replacement character. In sessions where Unicode Pass Through is not enabled, an error is returned. Noncharacters are Unicode code points that are permanently reserved for internal use.
SELECT U&'#FFFE' UESCAPE '#'; SELECT U&'#+01FFFF' UESCAPE '#';
Surrogate pairs and inappropriate surrogates in Unicode delimited character literals cause an error whether Unicode Pass Through is enabled or not.
SELECT U&'#D800#DC00' UESCAPE '#'; SELECT U&'#+00D800#+00DC00' UESCAPE '#'; SELECT U&'#+00D800#+00D801' UESCAPE '#'; SELECT U&'#+00D800' UESCAPE '#';
.set session charset 'ascii' set session character set unicode pass through on; sel u&'#+010000' uescape '#';
.set session charset 'utf8' set session character set unicode pass through off; sel u&'#+010000' uescape '#';
See UNICODE PASS THROUGH.
If your application is intended to be ANSI-compliant and portable, you can replace existing hexadecimal character literals of the form 'hexadecimal digits'XC with Unicode character literals.
Examples: Unicode Delimited Character Literal
Consider this table:
CREATE TABLE TextTable (IDNum INTEGER ,Ustring VARCHAR(10) CHARACTER SET UNICODE);
This statement uses a Unicode delimited character literal to insert the character string into the Ustring column, using the number sign (#) as the Unicode escape character:
INSERT TextTable (10, _Unicode U&'#8CC7#6599#5009#5132' UESCAPE '#');
This statement concatenates two strings to insert '855-34-9729' into the Ustring column:
INSERT TextTable (11, U&'855-34-' '9729' UESCAPE '%');
This statement selects all rows from the TextTable table where the string in the Ustring column is an empty string:
SELECT * FROM TextTable WHERE Ustring = U&'' UESCAPE '&';
Examples: Literals Containing Unicode BMP and Supplementary Characters
You can specify Unicode BMP characters using either the 4 or 6 digit syntax. Leading zeroes are required.
SELECT U&'#0041' UESCAPE '#' SELECT U&'#+000041' UESCAPE '#'
You can specify Unicode supplementary characters using the 6 digit syntax. Leading zeroes are required.
SELECT U&'#+020000' UESCAPE '#' SELECT U&'#+010000' UESCAPE '#'
Example: Error: Literal Containing Surrogate Code Points
SELECT U&'#+010000' UESCAPE '#'
This character requires two 16 bit surrogate code points in internal UNICODE format (which is based on UTF-16). The query:
SELECT CHAR2HEXINT(U&'#+010000' UESCAPE '#');
'D800DC00'
However, the following query returns an error because surrogate code points are not allowed in a Unicode delimited character literal:
SELECT CHAR2HEXINT(U&'#D800#DC00' UESCAPE '#');