Teradata Package for Python Function Reference | 17.10 - regexp_instr - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.10

Published

April 2022

Language

English (United States)

Last Update

2022-08-19

lifecycle

Product Category

Teradata Vantage

regexp_instr

Functions
		regexp_instr(source_string, regexp_string, position_arg, occurrence_arg, return_opt, match_arg) DESCRIPTION: Function searches source_string for a match to regexp_string. PARAMETERS: source_string: Required Argument. Specifies a ColumnExpression of a string column or a string literal to search for a match. Format for the argument: '<dataframe>.<dataframe_column>.expression'. If source_string is NULL, NULL is returned. regexp_string: Required Argument. Specifies a ColumnExpression of a string column or a string literal which is to be used as regex. Format for the argument: '<dataframe>.<dataframe_column>.expression'. If regexp_string is NULL, NULL is returned. position_arg: Optional Argument. Specifies the position in source_string from which to start searching. Argument accepts a ColumnExpression of a numeric column or a numeric constant. If the value is greater than the input string length, zero is returned. If the value is NULL, the value NULL is returned. If argument is not specified, the default value (1) is used. occurrence_arg: Optional Argument. Specifies the number of the occurrence to be returned. For example, if occurrence_arg is 2, the function matches the first occurrence in source_string and starts searching from the character following the first occurrence in source_string for the second occurrence in source_string. Argument accepts a ColumnExpression of a numeric column or a numeric constant. If the value is greater than the number of matches found, 0 is returned. If the value is NULL, a NULL result is returned. If argument is not specified, the default value (1) is used. return_opt: Optional Argument. Specifies a numeric value 0 or 1 that decides return value for the match. Argument accepts a ColumnExpression of a numeric column or a numeric constant. Valid values are: * 0 = function returns the beginning position of the match (default). * 1 = function returns the end position (character following the occurrence) of the match. If the value is NULL, NULL is returned. If argument is not specified, the default value (0) value is used. match_arg: Optional Argument. Specifies a character which decides the handling of regex matching. Valid values are: * 'i' = case-insensitive matching. * 'c' = case sensitive matching. * 'n' = the period character (match any character) can match the newline character. * 'm' = source_string is treated as multiple lines instead of as a single line. With this option, the '^' and '$' characters apply to each line in source_string instead of the entire source_string. * 'l' = if source_string exceeds the current maximum allowed source_string size (currently 16 MB), a NULL is returned instead of an error. This is useful for long-running queries where you do not want long strings causing an error that would make the query fail. * 'x' = ignore whitespace. The argument can contain more than one character. Notes: 1. If a character in the argument is not valid, then that error is reported. 2. If match_arg is not specified, is NULL, or is empty: a. The match is case-sensitive. b. A period does not match the newline character. c. source_string is treated as a single line. NOTE: Function accepts positional arguments only. EXAMPLES: # Load the data to run the example. >>> load_example_data("dataframe", "admissions_train") >>> # Create a DataFrame on 'admissions_train' table. >>> admissions_train = DataFrame("admissions_train") >>> admissions_train masters gpa stats programming admitted id 22 yes 3.46 Novice Beginner 0 36 no 3.00 Advanced Novice 0 15 yes 4.00 Advanced Advanced 1 38 yes 2.65 Advanced Beginner 1 5 no 3.44 Novice Novice 0 17 no 3.83 Advanced Advanced 1 34 yes 3.85 Advanced Beginner 0 13 no 4.00 Advanced Novice 1 26 yes 3.57 Advanced Advanced 1 19 yes 1.98 Advanced Advanced 0 >>> # Example searches for "d" substring in "stats" column and returns the position of the # matched string based on integer value in "admitted" column. # Import func from sqlalchemy to execute regexp_instr() function. >>> from sqlalchemy import func # Create a sqlalchemy Function object. # Note: Function name is case-insensitive. >>> regex_instr_ = func.REGEXP_INSTR(admissions_train.stats.expression, "d", 1, 1, ... admissions_train.admitted.expression, 'c') >>> # Pass the Function object as input to DataFrame.assign(). >>> df = admissions_train.assign(regex_instr_col=regex_instr_) >>> print(df) masters gpa stats programming admitted regex_instr_col id 22 yes 3.46 Novice Beginner 0 0 36 no 3.00 Advanced Novice 0 2 15 yes 4.00 Advanced Advanced 1 3 38 yes 2.65 Advanced Beginner 1 3 5 no 3.44 Novice Novice 0 0 17 no 3.83 Advanced Advanced 1 3 34 yes 3.85 Advanced Beginner 0 2 13 no 4.00 Advanced Novice 1 3 26 yes 3.57 Advanced Advanced 1 3 19 yes 1.98 Advanced Advanced 0 2 >>>