Character Shorthand Notation Used in This Document

SQL Operators and User-Defined Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database, Teradata Vantage
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-04-05

Product Category: Teradata Vantage™

This document uses the Unicode naming convention for characters. For example, the lowercase character ‘a’ is more formally specified as either LATIN SMALL LETTER A or U+0061. The U+xxxx notation refers to a particular code point in the Unicode standard, where xxxx stands for the hexadecimal representation of the 16-bit value defined in the standard.
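
As an illustration of the naming convention, the following minimal Python sketch prints the formal Unicode name and U+xxxx form of a character. It relies only on the standard `unicodedata` module; the helper name `describe` is hypothetical and not part of any Teradata tool.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the Unicode name and U+xxxx code point of one character.

    Hypothetical helper for illustration only.
    """
    return f"{unicodedata.name(ch)} (U+{ord(ch):04X})"


print(describe("a"))       # LATIN SMALL LETTER A (U+0061)
print(describe("A"))       # LATIN CAPITAL LETTER A (U+0041)
print(describe("\u3000"))  # IDEOGRAPHIC SPACE (U+3000)
```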

This document may use a symbol to represent a special character, or a particular class of characters, especially when discussing the following Japanese character encodings:

KanjiEBCDIC
KanjiEUC
KanjiShift-JIS

These encodings are further defined in International Character Set Support, B035-1125.

Character Symbols

The following table defines the symbols and their associated character sets.

Symbol	Encoding	Meaning
a-z A-Z 0-9	Any	Any single byte Latin letter or digit.
ａ-ｚ Ａ-Ｚ ０-９	Any	Any fullwidth Latin letter or digit.
<	KanjiEBCDIC	Shift Out [SO] (0x0E). Indicates the transition from single byte to multibyte characters in KanjiEBCDIC.
>	KanjiEBCDIC	Shift In [SI] (0x0F). Indicates the transition from multibyte to single byte characters in KanjiEBCDIC.
T	Any	Any multibyte character. The encoding depends on the current character set. For KanjiEUC, code set 3 characters are always preceded by ss3.
I	Any	Any single byte Hankaku Katakana character. In KanjiEUC, it must be preceded by ss2, forming an individual multibyte character.
Δ	Any	Represents the graphic pad character.
Δ	Any	Represents a single or multibyte pad character, depending on context.
ss2	KanjiEUC	Represents the EUC code set 2 introducer (0x8E).
ss3	KanjiEUC	Represents the EUC code set 3 introducer (0x8F).
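
The role of the code set 2 introducer can be observed with Python's built-in `euc_jp` and `shift_jis` codecs. This is only a sketch: these codecs are used here as stand-ins, on the assumption that they agree with KanjiEUC and KanjiShift-JIS for Hankaku Katakana, which this document does not confirm.

```python
# HALFWIDTH KATAKANA LETTER A (U+FF71), a single byte Hankaku Katakana
# character in Shift-JIS style encodings.
HANKAKU_A = "\uFF71"

SS2 = 0x8E  # EUC code set 2 introducer, from the table above

# Assumption: Python's codecs approximate the Teradata encodings here.
euc_bytes = HANKAKU_A.encode("euc_jp")
sjis_bytes = HANKAKU_A.encode("shift_jis")

print(euc_bytes.hex(" ").upper())  # 8E B1 (introducer, then the character)
print(sjis_bytes.hex().upper())    # B1 (a single byte, no introducer)
```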

For example, the string "TEST", where each letter is intended to be a fullwidth character, is written as ＴＥＳＴ. When encoding is important, hexadecimal representation is used.

For example, the following mixed single byte/multibyte character data in the KanjiEBCDIC character set:

LMN<ＴＥＳＴ>QRS

is represented as:

D3 D4 D5 0E 42E3 42C5 42E2 42E3 0F D8 D9 E2
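
The mapping from the shorthand to the hexadecimal form can be sketched in Python, with TEST entered as fullwidth characters (U+FF34 and so on). Python ships no KanjiEBCDIC codec, so this sketch makes two assumptions: the `cp037` codec stands in for the single byte part of KanjiEBCDIC, and a fullwidth uppercase Latin letter is encoded as 0x42 followed by the single byte code of its ordinary counterpart, a rule inferred from the example above. The function name `shorthand_to_hex` is hypothetical.

```python
import unicodedata

SHIFT_OUT = 0x0E  # written as "<" in the shorthand
SHIFT_IN = 0x0F   # written as ">" in the shorthand


def shorthand_to_hex(text: str) -> str:
    """Convert shorthand such as LMN<...>QRS to space-separated hex groups.

    Sketch only: handles ASCII characters and fullwidth uppercase Latin
    letters; anything else raises ValueError.
    """
    groups = []
    for ch in text:
        if ch == "<":
            groups.append(f"{SHIFT_OUT:02X}")
        elif ch == ">":
            groups.append(f"{SHIFT_IN:02X}")
        elif "\uFF21" <= ch <= "\uFF3A":
            # Fullwidth A-Z: fold to the ordinary letter, then prefix 0x42
            # (assumed rule, inferred from the documented example).
            narrow = unicodedata.normalize("NFKC", ch)
            groups.append("42" + narrow.encode("cp037").hex().upper())
        elif ch.isascii():
            # cp037 is an assumed stand-in for single byte KanjiEBCDIC.
            groups.append(ch.encode("cp037").hex().upper())
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return " ".join(groups)


print(shorthand_to_hex("LMN<\uFF34\uFF25\uFF33\uFF34>QRS"))
# D3 D4 D5 0E 42E3 42C5 42E2 42E3 0F D8 D9 E2
```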

Pad Characters

The following table lists the pad characters for the character data types.

Server Character Set	Pad Character Name	Pad Character Value
LATIN	SPACE	0x20
UNICODE	SPACE	U+0020
GRAPHIC	IDEOGRAPHIC SPACE	U+3000
KANJISJIS	ASCII SPACE	0x20
KANJI1	ASCII SPACE	0x20
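
The pad character table can be expressed as a lookup, as in the following Python sketch. The helper names `pad_right` and `strip_pad` are hypothetical, and the sketch counts length in characters, which is an assumption: it does not model how the database measures column length for each character set.

```python
# Pad characters per server character set, copied from the table above.
PAD_CHARS = {
    "LATIN": "\u0020",      # SPACE
    "UNICODE": "\u0020",    # SPACE
    "GRAPHIC": "\u3000",    # IDEOGRAPHIC SPACE
    "KANJISJIS": "\u0020",  # ASCII SPACE
    "KANJI1": "\u0020",     # ASCII SPACE
}


def pad_right(value: str, length: int, charset: str) -> str:
    """Right-pad value to length characters with the charset's pad character."""
    return value.ljust(length, PAD_CHARS[charset.upper()])


def strip_pad(value: str, charset: str) -> str:
    """Remove trailing pad characters for the given server character set."""
    return value.rstrip(PAD_CHARS[charset.upper()])


print(repr(pad_right("ab", 4, "LATIN")))         # 'ab  '
print(repr(pad_right("\u6F22", 3, "GRAPHIC")))   # '漢\u3000\u3000'
```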