15.00 - SUBSTRING/SUBSTR - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

SUBSTRING/SUBSTR

Purpose  

Extracts a substring from a named string based on position.

ANSI Syntax

where:

 

Syntax Element …

Specifies …

string_expression

a string expression from which the substring is to be extracted.

n1

the starting position of the substring to extract from string_expression.

FOR

a keyword indicating that the searched substring is bounded on the right by the value n2.

If you omit FOR n2, then you extract the entire right hand portion of the named string or string expression, beginning at the position named by n1.

If string_expression is a BYTE or CHAR type and you omit FOR n2, trailing binary zeros or pad characters are trimmed.

n2

the length of the substring to extract from string_expression.

If n2 < 0, the function returns an error.

Teradata Syntax

where:

 

Syntax Element …

Specifies …

string_expression

a string expression from which the substring is to be extracted.

n1

the starting position of the substring to extract from string_expression.

n2

the length of the substring to be extracted from string_expression.

If string_expression is a BYTE or CHAR type and you omit n2, trailing binary zeros or pad characters are trimmed.

If n2 < 0, the function returns an error.

ANSI Compliance

This is ANSI SQL:2011 compliant.

SUBSTR is a Teradata extension to the ANSI SQL:2011 standard.

Argument Types and Rules

SUBSTRING and SUBSTR operate on the following types of arguments:

  • Character
  • Byte
  • Numeric
  • If the string_expression argument is numeric, it is implicitly converted to CHARACTER type.

  • UDTs that have implicit casts to any of the following predefined types:
  • Character
  • Numeric
  • Byte
  • DATE
  • To define an implicit cast for a UDT, use the CREATE CAST statement and specify the AS ASSIGNMENT clause. For more information on CREATE CAST, see SQL Data Definition Language.

    Implicit type conversion of UDTs for system operators and functions, including SUBSTRING and SUBSTR, is a Teradata extension to the ANSI SQL standard. To disable this extension, set the DisableUDTImplCastForSysFuncOp field of the DBS Control Record to TRUE. For details, see Utilities: Volume 1 (A-K).

    For more information on implicit type conversion, see Chapter 13: “Data Type Conversions.”

    Result Type and Attributes

    Here are the default result type and attributes for SUBSTR(string, n1, n2) and SUBSTRING(string FROM n1 FOR n2):

    If the string argument is a:

  • BLOB, the result type is BLOB(n).
  • byte string other than BLOB, the result type is VARBYTE(n).
  • CLOB, the result type is CLOB(n).
  • numeric or character string other than CLOB, the result type is VARCHAR(n).
  • In ANSI mode, the value of n for the resulting BLOB(n), VARBYTE(n), CLOB(n), or VARCHAR(n) is the same as the original string. In Teradata mode, the value of n for the result type depends on the number of characters or bytes in the resulting string. To get the data type of the resulting string, use the TYPE function.

    Result Value

    SUBSTRING/SUBSTR extracts n2 characters or bytes from string_expression starting at position n1.

    To get the number of characters or bytes in the resulting string, use the BYTE function for byte strings and the CHARACTER_LENGTH function for character strings.

    If either of the following conditions are true, SUBSTRING/SUBSTR returns a zero length string:

  • (n1 > string_length) AND (0 n2)
  • (n1 < 1) AND (0 n2) AND ((n2 + n1 - 1) 0)
  • Usage Rules for SUBSTRING and SUBSTR

    SUBSTRING is the ANSI SQL:2011 syntax. Teradata syntax using SUBSTR is supported for backward compatibility. Use SUBSTRING in place of SUBSTR for ANSI compliance.

    Use SUBSTRING in place of MSUBSTR. (MSUBSTR no longer appears in this book because its use is deprecated and it will not be supported after support for KANJI1 is dropped.)

    Caution:

    In accordance with Teradata internationalization plans, KANJI1 support is deprecated and is to be discontinued in the near future. KANJI1 is not allowed as a default character set; the system changes the KANJI1 default character set to the UNICODE character set. Creation of new KANJI1 objects is highly restricted. Although many KANJI1 queries and applications may continue to operate, sites using KANJI1 should convert to another character set as soon as possible. “”

    Difference Between SUBSTRING and SUBSTR

    SUBSTRING and SUBSTR perform identically except when they operate on character strings in Teradata mode where the server character set is KANJI1 and the client character set is KanjiEBCDIC.

    In this case, SUBSTR interprets n1 and n2 as physical units, making the DB2‑compliant SUBSTR operate on a byte-by-byte basis. Shift-Out and Shift-In bytes are significant because the result might be formatted incorrectly. For example, the result string might not contain either the opening Shift-Out character or the closing Shift-In character.

    Otherwise, if string_expression is character data, then SUBSTRING expects mixed single byte and multibyte character strings and operates on logical characters that are valid for the character set of the session. In this case, n1 is a positive integer pointing to the first character of the result and n2 is in terms of logical characters.

    Example  

    Suppose sn is a CHARACTER(15) field of Serial IDs for Automobiles and positions 3 to 5 represent the country of origin as three letters.

    For example:

       12JAP3764-35421
       37USA9873-26189
       11KOR1221-13145

    To search for serial IDs of cars made in the USA:

       SELECT make, sn 
       FROM autos 
       WHERE SUBSTRING (sn FROM 3 FOR 3) = 'USA';

    Example  

    If we want the last five characters of the serial ID, which represent manufacturing sequence number, another substring can be accessed.

       SELECT make, SUBSTRING (sn FROM 11) AS sequence 
       FROM autos
       WHERE SUBSTRING (sn FROM 3 FOR 3) = 'USA';

    Example  

    Suppose nameaddress is a VARCHAR(120) field, and the application used positions 1 to 30 for name, starting address at position 31. To return address only, but limit the number of characters returned to 50 use:

       ...
       SUBSTRING (nameaddress FROM 31 FOR 50)

    This returns an address of up to 50 characters.

    Example  

    The following example shows a SELECT statement requesting substrings from a character field in positions 1 through 4 for every row:

       SELECT SUBSTRING (jobtitle FROM 1 FOR 4) 
       FROM employee ;

    The result is as follows.

       Substring(jobtitle From 1 For 4)
       --------------------------------
       Tech 
       Cont 
       Sale 
       Secr 
       Test
        ...

    Example  

    Consider the following table:

       CREATE TABLE cstr 
         (c1 CHAR(3) CHARACTER SET LATIN
         ,c2 CHAR(10) CHARACTER SET KANJI1);
     
       INSERT cstr ('abc', '92abc');
       

    Here are some examples of how to use SUBSTR to extract substrings from the KanjiEUC client character set.

     

    Function

    Result

    SELECT SUBSTR(c2, 2, 3) FROM cstr;

    '2a'

    SELECT SUBSTR(c1, 2, 2) FROM cstr;

    'bc'

    Example  

    Consider the following table:

       CREATE TABLE ctable1
         (c1 VARCHAR(11) CHARACTER SET KANJI1);

    The following table shows the difference between SUBSTR and SUBSTRING in Teradata mode for KANJI1 strings from KanjiEBCDIC client character set.

     

    IF c1 contains …

    THEN this query …

    Returns …

    MN<ABC>P

    SELECT SUBSTR(c1,2) FROM ctable1;

    N<ABC>P

    SELECT SUBSTR(c1,3,8) FROM ctable1;

    <ABC>

    SELECT SUBSTR(c1,4) FROM ctable1;

    ABC>P

    Note: The client application might not be able to properly interpret the resulting multibyte characters because the shift out (<) is missing.

    SELECT SUBSTRING(c1 FROM 2) 
    FROM ctable1;

    N<ABC>P

    SELECT SUBSTRING(c1 FROM 3 FOR 8) 
    FROM ctable1;

    <ABC>P

    SELECT SUBSTRING(c1 FROM 4) 
    FROM ctable1;

    <BC>P

    Example  

    The following table shows examples for the KanjiEUC client character set, where ctable1 is the table defined in Example 6.

     

    IF c1 contains …

    THEN this query …

    Returns …

    A ss2B CD

    SELECT SUBSTR(c1,2) FROM ctable1;

    ss2B CD

    ss3A ss2B ss3C ss2D

    SELECT SUBSTR(c1,2,2) FROM ctable1;

    ss2B ss3C

    Example  

    The following table shows examples for KanjiShift-JIS client character set, where ctable1 is the table defined in Example 6.

     

    IF c1 contains …

    THEN this query …

    Returns …

    mnABCX

    SELECT SUBSTR(c1, 6, 1) FROM ctable1;

    X

    SELECT SUBSTR(c1,4) FROM ctable1;

    BCX

    Example  

    The following statement applies the SUBSTRING function to a CLOB column in table full_text and stores the result in a CLOB column in table sub_text.

       INSERT sub_text (text)
       SELECT SUBSTRING (text FROM 9 FOR 128000)
       FROM full_text;