Defining Your Own Collation Sequence - Teradata Database

International Character Set Support

Product
Teradata Database
Release Number
15.10
Language
English (United States)
Last Update
2018-09-25
dita:id
B035-1125
lifecycle
previous
Product Category
Teradata® Database

To create your own collation sequence, define for each character of the alphabet the hexadecimal representation for each of the following:

  • The equivalence class
  • The casespecific value
  • The uppercase value
  • It is useful to have the hexadecimal representation for the Teradata Database internal value of the character so that the preceding items can be placed in the appropriate position within the defining byte string, as shown in “Example of Installing a Japanese Character Standard Collation” on page 118.

    Each group of values is inserted into a column in one row of DBC.CollationsV.

    Site-defined collations are valid only for single-byte characters.

    A defined collation is active for all server character sets, but they are not applied equally.

    You can change the MULTINATIONAL collation for all 256 characters in LATIN, and this definition holds true for those characters whether they occur in GRAPHIC, KANJISJIS, or UNICODE. Note that KANJISJIS and UNICODE have many more characters than LATIN and that the “extra” characters continue to follow the default multinational collation rules.

    The collation is two level, which, in the following case specific example, allows you to specify a collation that always produces the following ordering:

    1 aa

    2 Aa

    3 ab

    4 Ab

    A single-level, case blind collation might give that answer sometimes, but could theoretically also produce several other orderings, such as the following, on consecutive queries:

    1 Aa

    2 aa

    3 ab

    4 Ab

    The KANJI1 server character type is different because its collation is single level, based on the CollOrderCS and CollOrderUC fields. In other words, KANJI1 uses the second-level collation values as its only level. Because of this, it is not possible to define a collation that works well with both KANJI1 and other server character sets.

    Note also that for KANJI1, only single-byte characters are affected by the MULTINATIONAL collation definition. Multibyte characters continue to collate as always by means of binary comparison of the internal representation of those characters. This should be very similar to how those characters would collate in the client character set they were entered from, with the exception that EUC cs3 characters on the client would sort before, rather than after, cs1 characters.

    The CollInstallMulti macro takes the codes you load into the DBC.CollationsV view and inserts them into the underlying system table as the new default collation sequence.

    The :CollationName parameter you submit to the macro must match the CollName value you specified for DBC.CollationsV.

    During this process, CollInstallMulti performs the following actions.

    1 Verifies that the value of :CollationName is not MULTINATIONAL.

    2 Verifies that :CollationName is found in the CollName column of DBC.CollationTbl.

    3 Deletes the row where CollName is MULTINATIONAL.

    4 Inserts (through DBC.CollationsV) a row with the following values:

  • The string MULTINATIONAL in CollName
  • The letter Y in CollInstall
  • The codes for CollEqvClass, CollOrderCS, and CollOrderUC, which are selected from the row you loaded (where CollName should be equal to :CollationName).
  • The SQL code for CollInstallMulti is as follows:

       REPLACE MACRO CollInstallMulti (CollationName CHAR(30)) AS
       ( ABORT 'Argument cannot be MULTINATIONAL collation'
       WHERE :CollationName = 'MULTINATIONAL';
       ABORT 'Specified collation has not been defined'
       WHERE :CollationName NOT IN 
       ( SELECT CollName 
       FROM DBC.CollationsV );
       DELETE FROM DBC.CollationsV 
       WHERE CollName = 'MULTINATIONAL';
       INSERT INTO DBC.CollationsV
       SELECT 'MULTINATIONAL', 'Y', CollEqvClass,
       CollOrderCS, CollOrderUC
       FROM DBC.CollationsV
       WHERE CollName = :CollationName; );
       

    The name, attributes, and use of each column in DBC.CollationsV are defined as follows.

     

    Column Name

    Data Type

    Description

    CollName

    VARCHAR(128) NOT NULL

    Enter any legal name except MULTINATIONAL.

    When executed, the CollInstallMulti macro accepts any legal name other than MULTINATIONAL.

    CollInstall

    CHAR(1) NOT NULL

    Enter the letter N.

    When executed, CollInstallMulti inserts a Y indicating that this row is to be installed.

    CollEqvClass

    BYTE (256) BETWEEN 0x00 AND 0xFF NOT NULL

    Specify, for each character, the value of its equivalent class.

    For standard Japanese collations, each value in this field must be 0x00.

    CollOrderCS

    BYTE (256) BETWEEN 0x00 AND 0xFF NOT NULL

    Specify, for each character, the relative order of the character within its class for case-specific comparison.

    CollOrderUC

    BYTE (256) BETWEEN 0x00 AND 0xFF NOT NULL

    Specify, for each character, the relative order of the character within its class for uppercase comparison.

    For more information about DBC.CollationsV, see Data Dictionary.

    Use the DBC.CollationsV view to populate the system table with your collation codes.

    To populate the table through the view, perform the following procedure:

    1 Create a BTEQ batch file containing a CollName, a CollInstall value, and the hexadecimal definition of every character in the new collation sequence. Be sure that entries exist for every column of the DBC.CollationsV view.

    The value of CollName cannot be MULTINATIONAL. Also, the first SQL statement in the job file should be a DELETE statement to remove any existing row that might have the same CollName. For details, see “Example of Inserting and Activating a Site-Defined Collation Sequence” on page 122.

    2 Log on to the Teradata Database as user SYSADMIN or as user DBC.

    3 Use the following statement to verify the contents of database DBC:

       HELP DATABASE DBC; 

    The following objects must appear in the HELP listing for DBC to implement a site-defined collation sequence:

  • A table named CollationTbl
  • A view named CollationsV
  • The following macros:
  • CollAddStandard
  • CollInstallMulti
  • If these objects are not in the DBC database, use the DIP utility to create them.

    4 Run the BTEQ batch job you created in step 1.

    The row you insert into the DBC.CollationsV view automatically populates the DBC.CollationTbl.

    To activate the collation codes you inserted using the DBC.CollationsV view, perform the following procedure:

    1 Execute the CollInstallMulti macro, as follows:

    EXECUTE CollInstallMulti (‘collation_name’);

    where collation_name is the CollName you specified in your BTEQ job.

    2 Restart the Teradata Database (enter the restart tpa command from the Database Window of the system console) to install the codes as the new default collation sequence.

    From now on, the collation sequence you just installed is the default when the COLLATION option is set to MULTINATIONAL.

    The following example shows a BTEQ batch job that inserts into the DBC.CollationsV view a 256-character row containing the hexadecimal codes for a site-defined collation sequence, and then executes the CollInstallMulti macro to activate those codes as the new default.

    You must restart the Teradata Database to complete the installation.

    /* First delete from DBC.CollationsV any     */
    /* existing definitions and then insert     */
    /* the new definitions                      */
    DELETE FROM DBC.CollationsV 
     WHERE CollName = ‘NON_STANDARD’;
    INSERT INTO DBC.CollationsV
    ( CollName
    , CollInstall
    , CollEqvClass
    , CollOrderCS
    , CollOrderUC )
    VALUES
    ( ‘NON_STANDARD’
    , ‘N’
    /* CollEqvClass. Equivalence class.         */
    /*  0 1 2 3 4 5 6 7 8 9 A B C D E F         */ 
    , ‘000102030405060708090A0B0C0D0E0F’xb /* 0 */
    ||’101112131415161718191A1B1C1D1E1F’xb /* 1 */
    ||’202122232425262728292A2B2C2D2E2F’xb /* 2 */
    ||’303132333435363738393A3B3C3D3E3F’xb /* 3 */
    ||’406162636465666768696A6B6C6D6E6F’xb /* 4 */
    ||’707172737475767778797AB0B1B2B3B4’xb /* 5 */
    ||’B56162636465666768696A6B6C6D6E6F’xb /* 6 */
    ||’707172737475767778797AB6B7B8B9BA’xb /* 7 */
    ||’EAEBECEDBBBCBD80BFC0C1C2C3C4C5C6’xb /* 8 */
    ||’C7C8C9CACBCCCDCEEEEFF0CFD0D1D2D3’xb /* 9 */
    ||’F1D4D5D6F2D7F3D8D9DADBDCF4F5F6F7’xb /* A */
    ||’DDDEDFE0F8E1E2E3F9E4E5E6E7E8FAE9’xb /* B */
    ||’6161616161AEAB636565656569696969’xb /* C */
    ||’FC6E6F6F6F6F6F6FAC7575757579FB73’xb /* D */
    ||’6161616161AEAB636565656569696969’xb /* E */
    ||’FD6E6F6F6F6F6F6FAC7575757579FEFF’xb /* F */
    /* CollOrderCS. Case-specific ordering      */
    /*  0 1 2 3 4 5 6 7 8 9 A B C D E F         */ 
    , ‘00000000000000000000000000000000’xb /* 0 */
    ||’00000000000000000000000000000000’xb /* 1 */
    ||’00000000000000000000000000000000’xb /* 2 */
    ||’00000000000000000000000000000000’xb /* 3 */
    ||’00010101010101010101010101010101’xb /* 4 */
    ||’01010101010101010101010000000000’xb /* 5 */
    ||’00000000000000000000000000000000’xb /* 6 */
    ||’00000000000000000000000000000000’xb /* 7 */
    ||’00000000000000000000000000000000’xb /* 8 */
    ||’00000000000000000000000000000000’xb /* 9 */
    ||’00000000000000000000000000000000’xb /* A */
    ||’00000000000000000000000000000000’xb /* B */
    ||’030507090B0101030305070903050709’xb /* C */
    ||’0003030507090B0D0103050709030002’xb /* D */
    ||’020406080A0000020204060802040608’xb /* E */
    ||’0002020406080A0C0002040608020000’xb /* F */
    /* CollOrderUC. uppercase ordering          */
    /*  0 1 2 3 4 5 6 7 8 9 A B C D E F         */ 
    , ‘00000000000000000000000000000000’xb /* 0 */
    ||’00000000000000000000000000000000’xb /* 1 */
    ||’00000000000000000000000000000000’xb /* 2 */
    ||’00000000000000000000000000000000’xb /* 3 */
    ||’00010101010101010101010101010101’xb /* 4 */
    ||’01010101010101010101010000000000’xb /* 5 */
    ||’00010101010101010101010101010101’xb /* 6 */
    ||’01010101010101010101010000000000’xb /* 7 */
    ||’00000000000000000000000000000000’xb /* 8 */
    ||’00000000000000000000000000000000’xb /* 9 */
    ||’00000000000000000000000000000000’xb /* A */
    ||’00000000000000000000000000000000’xb /* B */
    ||’030507090B0101030305070903050709’xb /* C */
    ||’0003030507090B0D0503050709030001’xb /* D */
    ||’030507090B0101030305070903050709’xb /* E */
    ||’0003030507090B0D0503050709030001’xb /* F */);
    /* Now install this collation as the collation default */
       EXECUTE CollInstallMulti (‘NON_STANDARD’);

    To change the default from your defined collation back to either Swedish, Norwegian, or one of the Japanese collations, execute the CollInstallMult macro as explained in the previous passages.

    To return to the Teradata Standard Multinational collation, execute the CollAddStandard macro as explained in previous passages.

    Executing the CollAddStandard macro deletes your defined codes before reloading the codes for the Teradata Standard Multinational collation.