15.00 - INDEX - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

INDEX

Purpose  

Returns the position in string_expression_1 where string_expression_2 starts.

Syntax  

where:

 

Syntax element …

Specifies …

string_expression_1

a full string to be searched.

string_expression_2

a substring to be searched for its position within the full string.

ANSI Compliance

This is a Teradata extension to the ANSI SQL:2011 standard.

Use POSITION instead of INDEX for ANSI SQL:2011 compliance.

Argument Types and Rules

INDEX operates on the following types of arguments:

  • Character
  • Byte
  • If one string expression is of type BYTE, then both string expressions must be of type BYTE.

  • Numeric
  • If any string expression is numeric, then it is converted implicitly to CHARACTER type.

  • UDTs that have implicit casts that cast between the UDT and any of the following predefined types:
  • Numeric
  • Character
  • DATE
  • Byte
  • To define an implicit cast for a UDT, use CREATE CAST and specify AS ASSIGNMENT. For details on CREATE CAST, see SQL Data Definition Language.

    Implicit type conversion of UDTs for system operators and functions, including INDEX, is a Teradata extension to the ANSI SQL standard. To disable this extension, set the DisableUDTImplCastForSysFuncOp field of the DBS Control Record to TRUE. For details, see Utilities: Volume 1 (A-K).

    INDEX does not support CLOBs or BLOBs.

    For more information on implicit type conversion, see Chapter 13: “Data Type Conversions.”

    Result Type and Attributes

    Here are the default result type and attributes for INDEX(arg1, arg2):

     

    Data Type

    Heading

    INTEGER

    Index(arg1, arg2)

    Expected Values

    The following rules apply to the value that INDEX returns:

  • If string_expression_2 is not found in string_expression_1, then the result is zero.
  • If string_expression_2 is null, then the result is null.
  • If the arguments are character types, INDEX returns a logical character position, not a byte position, except when the server character set of the arguments is KANJI1 and the session client character set is KanjiEBCDIC.
  • For details, see “Rules for KANJI1 Server Character Set” on page 1203.

    Rules for Character Type Arguments

    If the arguments are character types, matching is in terms of logical characters. Single byte characters are matched against single byte characters, and multibyte characters are matched against multibyte characters. For a match to occur, representation of the logical character must be identical in both expressions.

    If the server character sets of the arguments are not the same, INDEX performs an implicit character translation. For a description of implicit character translation rules, see “Implicit Character-to-Character Translation” on page 603.

    The CASESPECIFIC attribute affects whether characters are considered to be a match.

     

    IF the session mode is …

    THEN the default case specification for character columns and literals is …

    ANSI

    CASESPECIFIC.

    Teradata

    NOT CASESPECIFIC.

    The exception is character data of type GRAPHIC, which is always CASESPECIFIC.

    To override the default case specification, you can apply the CASESPECIFIC or NOT CASESPECIFIC phrase to a character column in CREATE TABLE or ALTER TABLE.

    Or, you can apply the CASESPECIFIC or NOT CASESPECIFIC phrase to the INDEX character string arguments.

     

    IF …

    THEN …

    either argument has a CASESPECIFIC attribute (either by default or specified explicitly)

    simple Latin letters are considered to be matching only if they are the same letters and the same case.

    both arguments have a NOT CASESPECIFIC attribute (either by default or specified explicitly)

    before the operation begins, some characters are converted to uppercase.

    If the character is a lowercase simple Latin letter, the character is converted to uppercase before the operation begins.

    If the character is a non-Latin single byte character, a multibyte character, or a byte indicating a transition between single-byte and multibyte character data, the character is not converted to uppercase.

    Using the rules for character type arguments, if you want INDEX to match letters only if they are the same letters in the same case, specify the CASESPECIFIC phrase with at least one of the arguments. For example:

       SELECT Name 
       FROM Employee 
       WHERE INDEX(Name, 'X' (CASESPECIFIC)) = 1;

    If you want INDEX to match letters without considering the case, specify the NOT CASESPECIFIC phrase with both of the arguments.

    Rules for KANJI1 Server Character Set

    Caution:

    In accordance with Teradata internationalization plans, KANJI1 support is deprecated and is to be discontinued in the near future. KANJI1 is not allowed as a default character set; the system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible. “”

    When the server character set is KANJI1 and the client character set is KanjiEBCDIC, the offset count includes Shift-Out/Shift-In characters, but they are not matched. They are treated only as an indication of a transition from a single byte character and an multibyte character.

    The nonzero position of the result is reported as follows:

     

    IF the character set is …

    THEN the result is the …

    KanjiEBCDIC

    position of the first byte of the logical character offset (including Shift-Out/Shift-In in the offset count) within string_expression_1.

    other than KanjiEBCDIC

    logical character offset within string_expression_1.

    Relationship Between INDEX and POSITION

    INDEX and POSITION behave identically, except on character type arguments when the client character set is KanjiEBCDIC, the server character set is KANJI1, and an argument contains a multibyte character.

    For an example of when the two functions return different results for the same data, see “How POSITION and INDEX Differ” on page 1232.

    Example  

    The following table shows examples of simple INDEX expressions and their results.

     

    Expression

    Result

    INDEX('catalog','log')

    5

    INDEX('catalog','dog')

    0

    INDEX('41424344'XB,'43'XB)

    3

    Example  

    The following examples show how INDEX(string_1, string_2) operates when the server character set for string_1 and the server character set for string_2 differ. In these cases, both arguments are converted to UNICODE (if needed) and the characters are matched logically.

     

    IF string_1 is …

    AND string_2 is …

    THEN the result is …

    Character Set

    Data

    Character Set

    Data

     

    UNICODE

    92abc

    LATIN

    abc

    4

    UNICODE

    abc

    UNICODE

    c

    3

    KANJISJIS

    9204

    UNICODE

    0

    4

    Example  

    The following examples show how INDEX(string_1, string_2) operates when the server character set for both arguments is KANJI1 and the client character set is KanjiEBCDIC.

    Note that for KanjiEBCDIC, results are returned in terms of physical units, making INDEX DB2-compliant in that environment.

     

    IF string_1 contains …

    AND string_2 contains …

    THEN the result is …

    MN<AB>

    <B>

    6

    MN<AB>

    <A>

    4

    MN<AB>P

    P

    9

    MXN<AB>P

    <B>

    7

    Example  

    The following examples show how INDEX(string_1, string_2) operates when the server character set for both arguments is KANJI1 and the client character set is KanjiEUC.

     

    IF string_1 contains …

    AND string_2 contains …

    THEN the result is …

    a b ss3A

    ss3A

    3

    a b ss2B

    ss2B

    3

    CS1_DATA

    A

    6

    a b ss2D ss3E ss2F

    ss2F

    5

    a b C ss2D ss3E ss2F

    ss2F

    6

    CS1_DmATA

    A

    7

    Example  

    The following examples show how INDEX(string_1, string_2) operates when the server character set for both arguments is KANJI1 and the client character set is KanjiShift-JIS.

     

    IF string_1 contains …

    AND string_2 contains …

    THEN the result is …

    mnABCX

    B

    4

    mnABCX

    X

    6

    Example  

    In this example, INDEX is applied to ' ' (the SPACE character) in the value strings in the Name column of the Employee table.

       SELECT name 
       FROM employee 
       WHERE INDEX(name, ' ') > 6 ; 

    INDEX examines the Name field and returns all names where a space appears in a character position beyond the sixth (character position seven or higher).

    Example  

    The following example displays a list of projects in which the word Batch appears in the project description, and lists the starting position of the word.

       SELECT proj_id, INDEX(description, 'Batch') 
       FROM project 
       WHERE INDEX(description, 'Batch') > 0 ; 

    The system returns the following report.

       proj_id       Index (description, 'Batch')
       ------------- ----------------------------
       OE2-0003                                 5
       AP2-0003                                13
       OE1-0003                                 5
       AP1-0003                                13
       AR1-0003                                10
       AR2-0003                                10

    Example  

    A somewhat more complex construction employing concatenation, SUBSTRING, and INDEX might be more instructive. Suppose the employee table contains the following values.

       empno      name       
       ---------- -----------
       10021      Smith T
       10007      Aguilar J
       10018      Russell S
       10011      Chin M
       10019      Newman P

    You can transpose the form of the names from the name column selected from the employee table and change the punctuation in the report using the following query:

       SELECT empno, 
       SUBSTRING(name FROM INDEX(name,' ')+1 FOR 1)||'. '||
       SUBSTRING(name FROM 1 FOR INDEX(name, ' ')-1) 
       (TITLE 'Emp Name') 
       FROM employee ; 

    The system returns the following report.

       empno      ­Emp Name       
       ---------- --------------
       10021      T. Smith 
       10007      J. Aguilar 
       10018      S. Russell 
       10011      M. Chin 
       10019      P. Newman