Teradata® Package for Python Function Reference - 20.00 - regexp_replace
- Deployment: VantageCloud, VantageCore
- Edition: Enterprise, IntelliFlex, VMware
- Product: Teradata Package for Python
- Release Number: 20.00.00.03
- Published: December 2024
- ft:locale: en-US
- ft:lastEdition: 2024-12-19
- dita:id: TeradataPython_FxRef_Enterprise_2000
- lifecycle: latest
- Product Category: Teradata Vantage
teradataml.dataframe.sql.DataFrameColumn.regexp_replace = regexp_replace(regexp_string, replace_string, position, occurrence, match)
DESCRIPTION:
Replaces the portions of the string value in the column that match "regexp_string"
with "replace_string".
PARAMETERS:
regexp_string:
Required Argument.
Specifies a ColumnExpression of a string column or a string literal
to use for regex matching.
Note:
1. If regexp_string is NULL, NULL is returned.
Format for the argument: '<dataframe>.<dataframe_column>'.
Supported column types: CHAR, VARCHAR
Types: ColumnExpression, str
replace_string:
Required Argument.
Specifies a ColumnExpression of a string column or a string literal
to use as replacement.
Note:
1. If replace_string is not specified, is NULL, or is an empty string,
the matches are removed from the result.
Format for the argument: '<dataframe>.<dataframe_column>'.
Supported column types: CHAR, VARCHAR
Types: ColumnExpression, str
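The note on empty replacements can be tried locally. The following is a minimal sketch using Python's standard re module as a stand-in for the database function; the helper name remove_matches is hypothetical and is not part of teradataml.

```python
import re

def remove_matches(value, pattern):
    # Hypothetical local stand-in: an empty replacement deletes every match.
    return re.sub(pattern, "", value)

print(remove_matches("Novice", "vice"))    # No
print(remove_matches("Advanced", "vice"))  # Advanced (no match, unchanged)
```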
position:
Optional Argument.
Specifies the position in the string value of the column at which the search starts.
Note:
1. If the value is greater than the input string length, NULL is returned.
2. If the value is NULL, NULL is returned.
Types: ColumnExpression, int
Default Value: 1
occurrence:
Optional Argument.
Specifies which occurrence of the match to replace with replace_string.
Notes:
1. If 0 is specified, all occurrences are replaced.
2. If the value is greater than 1, the search for the second occurrence
begins with the first character following the first occurrence of
regexp_string, and so on.
3. If occurrence is greater than the number of matches found, nothing
is replaced and the string value in the column is returned unchanged.
4. If occurrence is NULL, NULL is returned.
Types: ColumnExpression, int
Default Value: 0
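The position and occurrence notes above can be approximated locally. This is a sketch in plain Python, not the database implementation: the function name emulate_regexp_replace is hypothetical, None stands in for NULL, the replacement is inserted literally (backreferences are not handled), and keeping the characters before the start position unchanged is an assumption.

```python
import re

def emulate_regexp_replace(value, pattern, replacement, position=1, occurrence=0):
    """Hypothetical local emulation of the documented position/occurrence notes."""
    # None (NULL) in these arguments yields None, as the notes describe.
    if value is None or pattern is None or position is None or occurrence is None:
        return None
    if position < 1:
        # Assumption: positions are 1-based, so smaller values are rejected here.
        raise ValueError("position is 1-based in this sketch")
    # A start position past the end of the string yields None.
    if position > len(value):
        return None
    replacement = replacement or ""  # None or "" removes the matches
    # Assumption: text before the start position is kept as-is.
    head, tail = value[:position - 1], value[position - 1:]
    if occurrence == 0:
        # 0 replaces every match found from the start position onward.
        return head + re.sub(pattern, lambda _m: replacement, tail)
    # Otherwise replace only the n-th non-overlapping match, if there is one.
    for n, m in enumerate(re.finditer(pattern, tail), start=1):
        if n == occurrence:
            return head + tail[:m.start()] + replacement + tail[m.end():]
    return value  # fewer matches than requested: input returned unchanged

print(emulate_regexp_replace("banana", "an", "AN"))        # bANANa
print(emulate_regexp_replace("banana", "an", "AN", 1, 2))  # banANa
print(emulate_regexp_replace("banana", "an", "AN", 1, 3))  # banana
print(emulate_regexp_replace("banana", "an", "AN", 7))     # None
```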
match:
Optional Argument.
Specifies one or more characters that control how regex matching is handled.
Notes:
1. If a character in the argument is not valid, then that character is ignored.
2. If match is not specified, is NULL, or is empty:
a. The match is case-sensitive.
b. A period does not match the newline character.
c. The string value in the column is treated as a single line.
3. The argument can contain more than one character.
Permitted values:
* 'i' - case-insensitive matching.
* 'c' - case-sensitive matching.
* 'n' - the period character (match any character) can match the newline character.
* 'm' - the string value in the column is treated as multiple lines instead of as a single line.
With this option, the '^' and '$' characters apply to each line of the string value
instead of to the entire string value.
* 'l' - if the string value in the column exceeds the current maximum allowed size
(currently 16 MB), NULL is returned instead of an error. This is useful for
long-running queries where you do not want long strings causing an error that
would make the query fail.
* 'x' - ignore whitespace.
Types: str
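For readers more familiar with Python's re module, the permitted values above correspond roughly to re flags. The mapping below is an assumed approximation for local experimentation, not how the database evaluates the argument; to_re_flags is a hypothetical helper, 'l' has no local equivalent, and letting the later of 'i' and 'c' win is an assumption of this sketch.

```python
import re

# Assumed correspondence between match characters and Python re flags.
OTHER_FLAGS = {"n": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}

def to_re_flags(match=None):
    """Hypothetical helper: translate a match string into re flags."""
    flags = 0
    ignore_case = False  # default: case-sensitive, as the notes state
    for ch in match or "":
        if ch == "i":
            ignore_case = True
        elif ch == "c":
            ignore_case = False
        else:
            # Characters without a mapping (including 'l') are ignored.
            flags |= OTHER_FLAGS.get(ch, 0)
    if ignore_case:
        flags |= re.IGNORECASE
    return flags

print(re.sub("VICE", "X", "Novice", flags=to_re_flags("i")))  # NoX
print(re.sub("VICE", "X", "Novice", flags=to_re_flags("c")))  # Novice
print(re.sub("^b", "X", "a\nb", flags=to_re_flags("m")))      # a, then X on the next line
```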
RAISES:
TypeError, ValueError, TeradataMlException
RETURNS:
DataFrameColumn
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
# Create a DataFrame on 'admissions_train' table.
>>> df = DataFrame("admissions_train")
>>> print(df)
    masters   gpa     stats  programming  admitted
id
22      yes  3.46    Novice     Beginner         0
36       no  3.00  Advanced       Novice         0
15      yes  4.00  Advanced     Advanced         1
38      yes  2.65  Advanced     Beginner         1
 5       no  3.44    Novice       Novice         0
17       no  3.83  Advanced     Advanced         1
34      yes  3.85  Advanced     Beginner         0
13       no  4.00  Advanced       Novice         1
26      yes  3.57  Advanced     Advanced         1
19      yes  1.98  Advanced     Advanced         0
# Example 1: Search for the substring "vice" in the "stats" column, replace
# its first occurrence with "w You See Me", and pass the result to
# DataFrame.assign().
>>> res_df = df.assign(col=df.stats.regexp_replace("vice", "w You See Me", 1, 1, 'c'))
>>> print(res_df)
    masters   gpa     stats  programming  admitted             col
id
 5       no  3.44    Novice       Novice         0  Now You See Me
34      yes  3.85  Advanced     Beginner         0        Advanced
13       no  4.00  Advanced       Novice         1        Advanced
40      yes  3.95    Novice     Beginner         0  Now You See Me
22      yes  3.46    Novice     Beginner         0  Now You See Me
19      yes  1.98  Advanced     Advanced         0        Advanced
36       no  3.00  Advanced       Novice         0        Advanced
15      yes  4.00  Advanced     Advanced         1        Advanced
 7      yes  2.33    Novice       Novice         1  Now You See Me
17       no  3.83  Advanced     Advanced         1        Advanced
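The values in the "col" column can be sanity-checked without a Vantage connection. The sketch below applies the same pattern to a few sample "stats" values with Python's re module; count=1 is used here as a local analogue of occurrence=1, and the list is illustrative rather than the full table.

```python
import re

# A few "stats" values taken from the example output above.
stats = ["Novice", "Advanced", "Advanced", "Novice"]

# Replace only the first match of "vice" in each value.
col = [re.sub("vice", "w You See Me", s, count=1) for s in stats]
print(col)  # ['Now You See Me', 'Advanced', 'Advanced', 'Now You See Me']
```

Rows whose value contains no match, such as "Advanced", come back unchanged, which agrees with the output shown in the example.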