Collations control character ordering and comparison operations during Teradata Database sessions.
A collation is either single-level or two-level. A two-level collation orders character strings according to a two-level comparison.
Characters are first partitioned into equivalence classes, each of which groups characters that have the same collating value. Both the relative order of the classes and the order of characters within a class are significant.
Comparisons obey the following rules:
The process is as follows:
1 Convert characters in the strings to be compared into equivalence classes.
2 Compare the strings.
IF the strings are… | THEN processing … |
not equal | stops. |
equal | continues. |
3 Order characters within each class using criteria defined for the collation sequence, and compare.
The MULTINATIONAL Norwegian standard collation sequence is an example of a two-level collation.
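The three-step process above can be illustrated with a minimal Python sketch. The `LEVELS` table, its class and rank values, and the function name `two_level_compare` are hypothetical and invented for illustration; they are not taken from any Teradata collation definition.

```python
from functools import cmp_to_key

# Hypothetical two-level table: character -> (equivalence class, rank in class).
# The values are assumptions for illustration, not a real Teradata sequence.
LEVELS = {
    "a": (1, 0), "A": (1, 1), "ä": (1, 2),
    "b": (2, 0), "B": (2, 1),
    "c": (3, 0), "C": (3, 1),
}

def two_level_compare(left, right):
    """Return -1, 0, or 1 using a two-level comparison."""
    # Step 1: convert the characters of each string into equivalence classes.
    left_classes = [LEVELS[ch][0] for ch in left]
    right_classes = [LEVELS[ch][0] for ch in right]

    # Step 2: compare the class strings; if they are not equal, processing stops.
    if left_classes != right_classes:
        return -1 if left_classes < right_classes else 1

    # Step 3: the class strings are equal, so compare the ordering of the
    # characters within each class.
    left_ranks = [LEVELS[ch][1] for ch in left]
    right_ranks = [LEVELS[ch][1] for ch in right]
    if left_ranks == right_ranks:
        return 0
    return -1 if left_ranks < right_ranks else 1

# Usage: sort strings with the comparison function.
words = ["b", "A", "ab", "a", "Ab"]
ordered = sorted(words, key=cmp_to_key(two_level_compare))
```

In this sketch, "a" and "A" fall into the same class, so they are separated only at the second level, while "b" always sorts after both regardless of case.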
Teradata Database offers five standard collation sequences. Character data can be defined as CASESPECIFIC or NOT CASESPECIFIC, and this attribute affects how each of the five collation sequences collates and compares the data.
The five collations, determined either by default or explicit use of the SET SESSION COLLATION statement, are:
CASESPECIFIC or NOT CASESPECIFIC can be chosen at table definition time, or specified as part of the SQL statement.
The default collation sequence is based upon the client type:
Collation sequence ordering is as follows:
The predefined multinational collation options are:
Further multinational collation options can be loaded using scripts. The database administrator can alter MULTINATIONAL collation.
When collation is set to MULTINATIONAL, the default sequence currently installed is used. This can be either one of the predefined sequences supplied with Teradata Database or a sequence you have defined and installed.
You can execute predefined macros to change the default to Swedish, Norwegian, or the appropriate Japanese standard collation. You can also define and install your own collation, as explained in “Defining Your Own Collation Sequence” on page 119.
If all the items being compared or collated are determined to be NOT CASESPECIFIC, the collation works as if all characters that have an uppercase counterpart were converted to uppercase before being processed through ASCII, EBCDIC, CHARSET_COLL or JIS_COLL collation.
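As a rough illustration of this behavior for the ASCII case only, the Python sketch below uppercases each string and then compares by code value. The function name `case_blind_ascii_key` is hypothetical, and the sketch assumes pure ASCII input; it does not model EBCDIC, CHARSET_COLL, or JIS_COLL.

```python
def case_blind_ascii_key(text):
    """Sort key that mimics a NOT CASESPECIFIC comparison under an
    ASCII-style collation (an assumption-laden sketch, ASCII input only)."""
    # encode() raises UnicodeEncodeError for non-ASCII input, which keeps
    # the sketch honest about its scope.
    raw = text.encode("ascii")
    # bytes.upper() converts only the ASCII letters a-z to A-Z; the
    # resulting bytes then compare by code value.
    return raw.upper()

names = ["apple", "Banana"]

# Case-specific code-value order: uppercase letters sort before lowercase.
case_specific = sorted(names)

# Case-blind order: both strings are compared as if uppercased first.
case_blind = sorted(names, key=case_blind_ascii_key)
```

Here `case_specific` places "Banana" first because "B" has a lower code value than "a", while `case_blind` places "apple" first because "APPLE" precedes "BANANA".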
Collation can be set or changed several different ways:
You can use predefined macros to change the collation default to Swedish, Norwegian, or the appropriate Japanese standard collation.
Note: Katakana_Standard, Kanji5026_Standard, and Kanji5035_Standard are designed for the KANJI1 server character set and should not be used with other server character sets. Similarly, the other predefined collation options should not be used for KANJI1 data.
You can also define and install your own collation sequences, as explained in “Defining Your Own Collation Sequence” on page 119.
Use the HELP SESSION statement to display the collation currently in effect for your session.
Selected collation and character mapping tables are described in the following text files, which are available only on CD-ROM or on the Teradata Information Products website at http://www.info.teradata.com/.
File Name (on CD) | Title (on the Web) | Description |
A6A0SUCD.txt | ARABIC1256_6A0 to Unicode | Maps ARABIC1256 to Unicode. |
blinddef.txt | Multinational Case Blind Default Collation | Defines the default for Multinational Case Blind collation. |
C1RMUNCD.txt | TCHBIG5_1R0 Multibyte to Unicode | Maps the multibyte character portion of TCHBIG5 to Unicode. |
C1RSUNCD.txt | TCHBIG5_1R0 Single Byte to Unicode | Maps the single-byte character portion of TCHBIG5 to Unicode. |
C1T0UNCD.txt | SCHGB2312_1T0 Code Set 0 to Unicode | Maps SCHGB2312 Code Set 0 to corresponding Latin letters of Unicode. |
C1T1UNCD.txt | SCHGB2312_1T0 Code Set 1 to Unicode | Maps SCHGB2312 Code Set 1 to Unicode. |
C2IMUNCD.txt | SCHEBCDIC935_2IJ Multibyte to Unicode | Maps the multibyte character portion of SCHEBCDIC935 to Unicode. |
C2ISUNCD.txt | SCHEBCDIC935_2IJ Single Byte to Unicode | Maps the single-byte character portion of SCHEBCDIC935 to Unicode. |
C3IMUNCD.txt | TCHEBCDIC937_3IB Multibyte to Unicode | Maps the multibyte character portion of TCHEBCDIC937 to Unicode. |
C2A0SUCD.txt | CYRILLIC1251_2A0 to Unicode | Maps CYRILLIC1251 to Unicode. |
EUC1UNCD.txt | KanjiEUC Code Set 1 to Unicode | Maps KanjiEUC Code Set 1 characters (JIS-x0208) to their Unicode equivalents. |
EUC2UNCD.txt | KanjiEUC Code Set 2 to Unicode | Maps KanjiEUC Code Set 2 characters (JIS-x0201 Katakana) to their Unicode equivalents. |
EUC3UNCD.txt | KanjiEUC Code Set 3 to Unicode | Maps KanjiEUC Code Set 3 characters (JIS-x0212) to their Unicode equivalents. |
H1IMUNCD.txt | HANGULEBCDIC933_1II Multibyte to Unicode | Maps the multibyte character portion of HANGULEBCDIC933 to Unicode. |
H1ISUNCD.txt | HANGULEBCDIC933_1II Single Byte to Unicode | Maps the single-byte character portion of HANGULEBCDIC933 to Unicode. |
H2RMUNCD.txt | HANGULKSC5601_2R4 Multibyte to Unicode | Maps the multibyte character portion of HANGULKSC5601 to Unicode. |
H2RSUNCD.txt | HANGULKSC5601_2R4 Single Byte to Unicode | Maps the single-byte character portion of HANGULKSC5601 to Unicode. |
H5A0SUCD.txt | HEBREW1255_5A0 to Unicode | Maps HEBREW1255 to Unicode. |
H7R0MUCD.txt | HANGUL949_7R0 Multibyte to Unicode | Maps the multibyte character portion of HANGUL949 to Unicode. |
H7R0SUCD.txt | HANGUL949_7R0 Single Byte to Unicode | Maps the single-byte character portion of HANGUL949 to Unicode. |
JIS_COLL.txt | JIS_COLL Case-Specific Collation | Defines the JIS_COLL Case-Specific collation. |
JISCOLBL.txt | JIS_COLL Case Blind Collation | Defines the JIS_COLL Case Blind collation. |
K1S0SUCD.txt | KANJI932_1S0 Single Byte to Unicode | Maps KANJI932 to Unicode. |
K1S0MUCD.txt | KANJI932_1S0 Multibyte to Unicode | Maps the multibyte character portion of KANJI932 to Unicode. |
L1A0SUCD.txt | LATIN1250_1A0 to Unicode | Maps LATIN1250 to Unicode. |
L3A0SUCD.txt | LATIN1252_3A0 to Unicode | Maps LATIN1252 to Unicode. |
L7A0SUCD.txt | LATIN1254_7A0 to Unicode | Maps LATIN1254 to Unicode. |
L8A0SUCD.txt | LATIN1258_8A0 to Unicode | Maps LATIN1258 to Unicode. |
multnatl.txt | Multinational Case-Specific Default Collation | Defines the default for Multinational Case-Specific collation. |
S6R0MUCD.txt | SCHINESE936_6R0 Multibyte to Unicode | Maps the multibyte character portion of SCHINESE936 to Unicode. |
S6R0SUCD.txt | SCHINESE936_6R0 Single Byte to Unicode | Maps SCHINESE936 to Unicode. |
SJISSJIS.txt | KanjiSJIS to KanjiSJIS multibyte | Maps KanjiShiftJIS to KanjiShiftJIS multibyte characters. |
SJISUNCD.txt | KanjiSJIS to Unicode multibyte | Maps KanjiShiftJIS characters to their multibyte Unicode equivalents. |
SOSIUNCD.txt | KanjiEBCDIC (SO/SI) to Unicode | Maps multibyte character portion of KanjiEBCDIC (SO/SI) to Unicode. |
T4A0SUCD.txt | THAI874_4A0 Single Byte to Unicode | Maps THAI874 to Unicode. |
T8R0MUCD.txt | TCHINESE950_8R0 Multibyte to Unicode | Maps the multibyte character portion of TCHINESE950 to Unicode. |
T8R0SUCD.txt | TCHINESE950_8R0 Single Byte to Unicode | Maps TCHINESE950 to Unicode. |
UNCDUNCD.txt | Unicode to Unicode | Lists the characters supported by the UNICODE server character set. |
UNCDVARG.txt | Unicode to Vargraphic | |
UNCDSJIS.txt | Unicode to KanjiSJIS | Maps Unicode characters to their KanjiShiftJIS multibyte equivalents. |
UNCDE123.txt | Unicode to KanjiEUC Sets 1, 2, 3 | Maps Unicode characters to KanjiEUC Code Set 1, 2, and 3 (JIS-x0208) as UNIX Process Code (UPC). |
UNCDSOSI.txt | Unicode to KanjiEBCDIC (SO/SI) | Maps Unicode characters to the multibyte character portion of KanjiEBCDIC (SO/SI). |
UOBJNSTD.txt | Unicode in Object Names on standard language support systems | Lists characters from the UNICODE server character set that are allowed in object names on standard language support systems. |
UOBJNJAP.txt | Unicode in Object Names on Japanese language support systems | Lists characters from the UNICODE server character set that are allowed in object names on Japanese language support systems. |