Teradata Package for Python Function Reference | 17.10 - contains - Teradata Package for Python
Teradata® Package for Python Function Reference
- Product: Teradata Package for Python
- Release Number: 17.10
- Published: April 2022
- Language: English (United States)
- Last Update: 2022-08-19
- Product Category: Teradata Vantage
teradataml.dataframe.sql.DataFrameColumn.contains = contains(self, pattern, case=True, na=None, **kw)
Test if the regexp pattern matches strings in the Series.
PARAMETERS:
    pattern: str. A regular expression pattern.
    case: bool. True (the default) for case-sensitive matching; False for case-insensitive matching.
    na: bool, str, or numeric Python literal. None by default.
        Specifies an optional fill value for NULL values in the column.
    **kw: optional parameters to pass to REGEXP_SUBSTR.
        - match_arg: a string of characters to use for the match_arg parameter of REGEXP_SUBSTR.
          See the REFERENCE section for more information about the match_arg parameter.
          Note: specifying match_arg overrides the case parameter.
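The precedence between case and match_arg can be sketched in plain Python. This is only an illustration, not teradataml's implementation: resolve_match_arg is a hypothetical helper, and the mapping of case=True to 'c' and case=False to 'i' is an assumption based on the case-sensitivity flags that REGEXP_SUBSTR accepts.

```python
def resolve_match_arg(case=True, match_arg=None):
    """Hypothetical helper: choose the match_arg string sent to REGEXP_SUBSTR."""
    if match_arg is not None:
        # An explicitly supplied match_arg overrides the case parameter.
        return match_arg
    # Assumption: 'c' requests case-sensitive matching, 'i' case-insensitive.
    return 'c' if case else 'i'


print(resolve_match_arg())                          # case-sensitive default
print(resolve_match_arg(case=False))                # case-insensitive
print(resolve_match_arg(case=True, match_arg='i'))  # match_arg wins over case
```

Under these assumptions, the last call resolves to 'i' even though case=True was passed.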
REFERENCE:
SQL Functions, Operators, Expressions, and Predicates
Chapter 24: Regular Expression Functions
RETURNS:
A numeric Series of values where:
- NULL values are replaced by the fill value given in na (they stay NULL when na is None)
- Every other value is 1 if it matches the pattern, else 0
The type of the Series is upcast to support the fill value, if specified.
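The return rules above can be mimicked locally with Python's re module. This is a rough sketch of the described semantics, not how teradataml evaluates the expression: the real call is translated to SQL and run in the database, and Python's regex dialect may differ from the one REGEXP_SUBSTR uses. The function name contains_flags and the sample list are made up for illustration.

```python
import re


def contains_flags(values, pattern, case=True, na=None):
    """Return 1/0 per value, with None (NULL) replaced by the na fill value."""
    flags = 0 if case else re.IGNORECASE
    regex = re.compile(pattern, flags)
    result = []
    for value in values:
        if value is None:
            # NULL input: emit the fill value (None when no fill is given).
            result.append(na)
        else:
            # A match anywhere in the string counts, as with a substring search.
            result.append(1 if regex.search(value) else 0)
    return result


names = ['Iris-setosa', 'Iris-versicolor', None]
print(contains_flags(names, 'setosa'))
print(contains_flags(names, 'iris'))
print(contains_flags(names, 'iris', case=False, na='no value'))
```

A Python list can hold mixed types, so the sketch needs no upcasting; a database column cannot, which is why the result type is widened when a fill value such as 'no value' is supplied.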
EXAMPLES:
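>>> from teradataml import DataFrame
>>> # assumes an active Vantage connection and an existing 'iris' table
>>> tdf = DataFrame('iris')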
>>> species = tdf['Name']
>>> tdf.assign(drop_columns = True,
               Name = species,
               has_setosa = species.str.contains('setosa'))
Name has_setosa
0 Iris-setosa 1
1 Iris-versicolor 0
2 Iris-setosa 1
3 Iris-versicolor 0
4 Iris-setosa 1
5 Iris-versicolor 0
6 Iris-versicolor 0
7 Iris-virginica 0
8 Iris-setosa 1
9 None None
# case-sensitive by default
>>> tdf.assign(drop_columns = True,
               Name = species,
               has_iris = species.str.contains('iris'))
Name has_iris
0 Iris-versicolor 0
1 Iris-versicolor 0
2 Iris-virginica 0
3 Iris-versicolor 0
4 Iris-virginica 0
5 Iris-versicolor 0
6 Iris-versicolor 0
7 Iris-virginica 0
8 Iris-setosa 0
9 None None
# case-insensitive matching
>>> tdf.assign(drop_columns = True,
               Name = species,
               has_iris = species.str.contains('iris', case = False))
Name has_iris
0 Iris-versicolor 1
1 Iris-versicolor 1
2 Iris-virginica 1
3 Iris-versicolor 1
4 Iris-virginica 1
5 Iris-versicolor 1
6 Iris-versicolor 1
7 Iris-virginica 1
8 Iris-setosa 1
9 None None
# specify a literal for null values
>>> tdf.assign(drop_columns = True,
               Name = species,
               has_iris = species.str.contains('iris', case = False, na = 'no value'))
Name has_iris
0 Iris-versicolor 1
1 Iris-versicolor 1
2 Iris-virginica 1
3 Iris-versicolor 1
4 Iris-virginica 1
5 Iris-versicolor 1
6 Iris-versicolor 1
7 Iris-virginica 1
8 Iris-setosa 1
9 None no value
# filter where Name has 'setosa'
>>> tdf[species.str.contains('setosa') == True].select('Name')
Name
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
5 Iris-setosa
6 Iris-setosa
7 Iris-setosa
8 Iris-setosa
9 Iris-setosa
# filter where Name does not have 'setosa'
>>> tdf[species.str.contains('setosa') == False].select('Name')
Name
0 Iris-versicolor
1 Iris-versicolor
2 Iris-versicolor
3 Iris-versicolor
4 Iris-versicolor
5 Iris-virginica
6 Iris-virginica
7 Iris-virginica
8 Iris-versicolor
9 Iris-virginica
# you can use numeric literals for True (1) and False (0)
>>> tdf[species.str.contains('setosa') == 1].select('Name')
Name
0 Iris-setosa
1 Iris-setosa
2 Iris-setosa
3 Iris-setosa
4 Iris-setosa
5 Iris-setosa
6 Iris-setosa
7 Iris-setosa
8 Iris-setosa
9 Iris-setosa
>>> tdf[species.str.contains('setosa') == 0].select('Name')
Name
0 Iris-versicolor
1 Iris-versicolor
2 Iris-versicolor
3 Iris-versicolor
4 Iris-versicolor
5 Iris-virginica
6 Iris-virginica
7 Iris-virginica
8 Iris-versicolor
9 Iris-virginica
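The filter examples above compare the 1/0 result against True/False or 1/0 interchangeably. A plain-Python sketch of that equivalence follows. It is not teradataml code: match_flag and filter_names are hypothetical helpers, and dropping NULL names from both filters is an assumption based on standard SQL comparison semantics, where a comparison with NULL is never true.

```python
import re


def match_flag(name, pattern):
    """Return 1 or 0 for a string, or None for a NULL (None) input."""
    if name is None:
        return None
    return 1 if re.search(pattern, name) else 0


def filter_names(names, pattern, expected):
    """Keep names whose match flag equals expected (True/1 or False/0)."""
    kept = []
    for name in names:
        flag = match_flag(name, pattern)
        # Assumption: NULL flags satisfy neither comparison, as in SQL.
        if flag is not None and flag == expected:
            kept.append(name)
    return kept


names = ['Iris-setosa', 'Iris-virginica', None, 'Iris-setosa']
print(filter_names(names, 'setosa', True))   # same rows as expected=1
print(filter_names(names, 'setosa', 0))      # same rows as expected=False
```

In Python, True == 1 and False == 0, so both spellings of the comparison select the same rows in this sketch.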