Use the replace() function to replace every occurrence of to_replace with the value in the columns mentioned in subset. When subset is not provided, this function replaces occurrences of to_replace in all columns.
Required Arguments:
- to_replace: Specifies a ColumnExpression or a literal that the function searches for values in the Column. Use ColumnExpression when you want to match the condition based on a DataFrameColumn function, else use literal.
Supported column types: CHAR, VARCHAR, FLOAT, INTEGER, DECIMAL.
Only ColumnExpressions generated from DataFrameColumn functions are supported. BinaryExpressions are not supported.For example: Consider teradataml DataFrame has two columns COL1, COL2. df.COL1.abs() is supported but df.COL1 == df.COL2 is not supported.
- value: Specifies a ColumnExpression or a literal that replaces the to_replace in the column. Use ColumnExpression only when the value you want to replace is based on a DataFrameColumn function; otherwise, use literal.
Supported column types: CHAR, VARCHAR, FLOAT, INTEGER, DECIMAL.
This argument is required argument when to_replace is not a dictionary. Otherwise, it is optional and ignored.
Only ColumnExpressions generated from DataFrameColumn functions are supported. BinaryExpressions are not supported.For example: Consider teradataml DataFrame has two columns COL1, COL2. df.COL1.abs() is supported but df.COL1 == df.COL2 is not supported.
Optional Arguments:
- subset: Specifies the columns to consider for replacing the values.
Example Setup
- Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
- Create a DataFrame on 'admissions_train' table.
>>> df = DataFrame("admissions_train")
>>> print(df) masters gpa stats programming admitted id 15 yes 4.00 Advanced Advanced 1 34 yes 3.85 Advanced Beginner 0 13 no 4.00 Advanced Novice 1 38 yes 2.65 Advanced Beginner 1 5 no 3.44 Novice Novice 0 40 yes 3.95 Novice Beginner 0 7 yes 2.33 Novice Novice 1 22 yes 3.46 Novice Beginner 0 26 yes 3.57 Advanced Advanced 1 17 no 3.83 Advanced Advanced 1
Example 1: Replace the string 'Advanced' with 'Good' in columns 'stats' and 'programming'
>>> res = df.replace({"Advanced": "Good", "Beginner": "starter"}, subset=["stats", "programming"])
>>> print(res) masters gpa stats programming admitted id 15 yes 4.00 Good Good 1 7 yes 2.33 Novice Novice 1 22 yes 3.46 Novice starter 0 17 no 3.83 Good Good 1 13 no 4.00 Good Novice 1 38 yes 2.65 Good starter 1 26 yes 3.57 Good Good 1 5 no 3.44 Novice Novice 0 34 yes 3.85 Good starter 0 40 yes 3.95 Novice starter 0
Example 2: Replace the string 'Advanced' with 'Good' and 'Beginner' with 'starter' in columns 'stats' and 'programming'
>>> res = df.replace({"Advanced": "Good", "Beginner": "starter"}, subset=["stats", "programming"])
>>> print(res) masters gpa stats programming admitted id 15 yes 4.00 Good Good 1 7 yes 2.33 Novice Novice 1 22 yes 3.46 Novice starter 0 17 no 3.83 Good Good 1 13 no 4.00 Good Novice 1 38 yes 2.65 Good starter 1 26 yes 3.57 Good Good 1 5 no 3.44 Novice Novice 0 34 yes 3.85 Good starter 0 40 yes 3.95 Novice starter 0
Example 3: Append the string '_New' to 'stats' column when values in 'programming' and 'stats' are same
>>> res = df.replace({df.programming: df.stats+"_New"}, subset=["stats"])
>>> print(res) masters gpa stats programming admitted id 15 yes 4.00 Advanced_New Advanced 1 34 yes 3.85 Advanced Beginner 0 13 no 4.00 Advanced Novice 1 38 yes 2.65 Advanced Beginner 1 5 no 3.44 Novice_New Novice 0 40 yes 3.95 Novice Beginner 0 7 yes 2.33 Novice_New Novice 1 22 yes 3.46 Novice Beginner 0 26 yes 3.57 Advanced_New Advanced 1 17 no 3.83 Advanced_New Advanced 1
Example 4: Round the values of 'gpa' to its nearest integer
>>> res = df.replace({df.gpa: df.gpa.round(0)}, subset=["gpa"])
>>> print(res) masters gpa stats programming admitted id 15 yes 4.0 Advanced Advanced 1 7 yes 2.0 Novice Novice 1 22 yes 3.0 Novice Beginner 0 17 no 4.0 Advanced Advanced 1 13 no 4.0 Advanced Novice 1 38 yes 3.0 Advanced Beginner 1 26 yes 4.0 Advanced Advanced 1 5 no 3.0 Novice Novice 0 34 yes 4.0 Advanced Beginner 0 40 yes 4.0 Novice Beginner 0
Example 5: Replace the value of 'masters' with '1' if value is 'yes' and with '0' if value is 'no'
>>> res = df.replace({'yes': 1, 'no': 0}, subset=["masters"])
>>> print(res) masters gpa stats programming admitted id 15 1 4.00 Advanced Advanced 1 7 1 2.33 Novice Novice 1 22 1 3.46 Novice Beginner 0 17 0 3.83 Advanced Advanced 1 13 0 4.00 Advanced Novice 1 38 1 2.65 Advanced Beginner 1 26 1 3.57 Advanced Advanced 1 5 0 3.44 Novice Novice 0 34 1 3.85 Advanced Beginner 0 40 1 3.95 Novice Beginner 0