15.00 - REGEXP_REPLACE - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product: Teradata Database
Release Number: 15.00
Content Type: Programming Reference
Publication ID: B035-1145-015K
Language: English (United States)
Last Update: 2018-09-24

REGEXP_REPLACE

Purpose  

Replaces portions of source_string that match regexp_string with the replace_string.

REGEXP_REPLACE supports 2, 3, 4, 5, or 6 parameters.
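As an illustration of the parameter counts, the calls below sketch each form, assuming the arguments are supplied in the order in which they are described in this section. The literal values are illustrative only, and the TD_SYSFNLIB qualifier in the last call is assumed to be optional.

```sql
-- 2 parameters: source_string and regexp_string only.
SELECT REGEXP_REPLACE('Hello World', 'World');

-- 3 parameters: adds replace_string.
SELECT REGEXP_REPLACE('Hello World', 'World', 'There');

-- 4 parameters: adds position_arg.
SELECT REGEXP_REPLACE('Hello World', 'World', 'There', 1);

-- 5 parameters: adds occurrence_arg.
SELECT REGEXP_REPLACE('Hello World', 'World', 'There', 1, 1);

-- 6 parameters: adds match_arg; the function name is qualified with its database.
SELECT TD_SYSFNLIB.REGEXP_REPLACE('Hello World', 'World', 'There', 1, 1, 'c');
```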

Syntax  

where:

 

Syntax element …

Specifies …

TD_SYSFNLIB

the name of the database where the function is located.

source_string

a character argument.

If source_string is NULL, NULL is returned.

regexp_string

a character argument.

If regexp_string is NULL, NULL is returned.

replace_string

a character argument.

If replace_string is not specified, is NULL or is an empty string, the matches are removed from the result.
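A minimal sketch of the removal behavior described above. The literals are illustrative, and the typed NULL in the second call is an assumption made to avoid passing an untyped NULL to the function.

```sql
-- Empty replace_string: the first match of ' World' is removed.
SELECT REGEXP_REPLACE('Hello World', ' World', '', 1, 1, 'c');  -- expected: 'Hello'

-- NULL replace_string: per the rule above, the match is likewise removed.
SELECT REGEXP_REPLACE('Hello World', ' World', CAST(NULL AS VARCHAR(10)), 1, 1, 'c');
```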

position_arg

a numeric argument.

position_arg specifies the position in source_string from which to start searching (default is 1).

If position_arg is greater than the input string length, nothing is replaced and source_string is returned.

If position_arg is NULL, the NULL value is used. If position_arg is not specified, the default value (1) is used.
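The following sketch illustrates how position_arg shifts the start of the search. The input literals are illustrative, and the expected results follow from the rules above rather than from a captured session.

```sql
-- Position 4 is the 'h' of the second 'ho', so the first 'ho' is skipped.
SELECT REGEXP_REPLACE('ho ho ho', 'ho', 'HO', 4, 1, 'c');   -- expected: 'ho HO ho'

-- Position 20 is beyond the 8-character input, so nothing is replaced.
SELECT REGEXP_REPLACE('ho ho ho', 'ho', 'HO', 20, 1, 'c');  -- expected: 'ho ho ho'
```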

occurrence_arg

a numeric argument.

occurrence_arg specifies the number of the occurrence to be replaced. For example, if occurrence_arg is 2, the function matches the first occurrence in source_string, then searches for the second occurrence starting from the character that follows the first.

If occurrence_arg is greater than the number of matches found, NULL is returned.

If occurrence_arg is NULL, NULL is returned. If occurrence_arg is omitted, the default value (1) is used. This differs from the Oracle implementation of REGEXP_REPLACE, where the default value when occurrence_arg is omitted is 0.
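Because the default for an omitted occurrence_arg differs from Oracle, one cautious approach is to pass occurrence_arg explicitly. A minimal sketch with illustrative literals:

```sql
-- Replace only the first match of 'ho'.
SELECT REGEXP_REPLACE('ho ho ho', 'ho', 'HO', 1, 1, 'c');  -- expected: 'HO ho ho'

-- Replace only the third match of 'ho'.
SELECT REGEXP_REPLACE('ho ho ho', 'ho', 'HO', 1, 3, 'c');  -- expected: 'ho ho HO'
```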

match_arg

a character argument.

Valid values are:

'i' = case-insensitive matching.

'c' = case-sensitive matching.

'n' = the period character (match any character) can match the newline character.

'm' = source_string is treated as multiple lines instead of as a single line. With this option, the '^' and '$' characters apply to each line in source_string instead of the entire source_string.

'l' = if source_string exceeds the current maximum allowed source string size (currently 16 MB), a NULL is returned instead of an error. This is useful for long-running queries where you do not want long strings causing an error that would make the query fail.

'x' = ignore whitespace.

The argument can contain more than one character. If a character in the argument is not valid, then that character is ignored.
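To illustrate the effect of match_arg, the sketch below runs the same pattern with different option characters. The literals are illustrative; the last call combines two options in one argument, as permitted above.

```sql
-- 'i': the pattern matches 'Hello' regardless of case.
SELECT REGEXP_REPLACE('Hello hello', 'hello', 'bye', 1, 1, 'i');    -- expected: 'bye hello'

-- 'c': only the lowercase 'hello' matches.
SELECT REGEXP_REPLACE('Hello hello', 'hello', 'bye', 1, 1, 'c');    -- expected: 'Hello bye'

-- 'im': case-insensitive and multi-line; '^' anchors at the start of each line.
SELECT REGEXP_REPLACE('Hello hello', '^hello', 'bye', 1, 1, 'im');  -- expected: 'bye hello'
```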

If match_arg is not specified, is NULL, or is empty:

  • The match is case sensitive.
  • A period does not match the newline character.
  • source_string is treated as a single line.

ANSI Compliance

This is a Teradata extension to the ANSI SQL:2011 standard.

Invocation

REGEXP_REPLACE is an embedded services system function. For information on activating and invoking embedded services functions, see “Embedded Services System Functions.”

Argument Types and Rules

Expressions passed to this function must have the following data types:

  • source_string = CHAR, VARCHAR, or CLOB
  • regexp_string = CHAR or VARCHAR (maximum size of 512 bytes)
  • replace_string = CHAR, VARCHAR, or CLOB (truncated to 32 KB)
  • position_arg = NUMBER
  • occurrence_arg = NUMBER
  • match_arg = VARCHAR

source_string parameters that are CLOBs can be a maximum of 16 MB. The function returns an error if this size is exceeded unless match_arg includes 'l', in which case it returns NULL.
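As a sketch of the CLOB size rule, the query below passes 'l' in match_arg so that an oversized value yields NULL instead of failing the query. The table documents and its columns doc_id and body are hypothetical names introduced only for this illustration.

```sql
-- Hypothetical table: documents(doc_id INTEGER, body CLOB).
-- With 'l', a body larger than the maximum source size returns NULL
-- for that row instead of raising an error.
SELECT doc_id,
       REGEXP_REPLACE(body, 'DRAFT', 'FINAL', 1, 1, 'l') AS body_final
FROM documents;
```

Rows that come back NULL can then be inspected separately, for example by also selecting the length of body.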

You can also pass arguments with data types that can be converted to the above types using the implicit data type conversion rules that apply to UDFs.

Note: The UDF implicit type conversion rules are more restrictive than the implicit type conversion rules normally used by Teradata Database. If an argument cannot be converted to the required data type following the UDF implicit conversion rules, it must be explicitly cast.

For details, see “Compatible Types” in SQL External Routine Programming.
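Where it is unclear whether an argument qualifies for implicit conversion, an explicit cast sidesteps the question. The sketch below casts an integer literal to VARCHAR before passing it as source_string; whether this particular value would also convert implicitly is not stated here, so the cast is a precaution rather than a documented requirement.

```sql
-- Explicitly cast a numeric value to a character type for source_string.
SELECT REGEXP_REPLACE(CAST(20180924 AS VARCHAR(8)), '^2018', 'FY18-', 1, 1, 'c');
```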

Result Type

REGEXP_REPLACE is a scalar function whose return data type depends on the data type of the source_string argument passed to the function.

A source_string of:

  • CHAR or VARCHAR returns VARCHAR in the same character set as source_string.
  • CLOB returns CLOB in the same character set as source_string.

Example

The following query:

    SELECT REGEXP_REPLACE('Hello World World', '(world)$', 'My', 1, 1, 'i');

returns the result 'Hello World My'.

Example

The following query:

    SELECT REGEXP_REPLACE('Friday is the best day of the week.', 'of the week', 'EVER', 1, 1, 'c');

returns the result 'Friday is the best day EVER.'.

Example

The following query:

    SELECT REGEXP_REPLACE('Hello Santa says ho ho', 'ho', 'HO!', 1, 2, 'c');

returns the result 'Hello Santa says ho HO!'.