Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment: VantageCloud
- Edition: Lake
- Product: Teradata Package for Python
- Release Number: 20.00.00.03
- Published: December 2024
- ft:locale: en-US
- ft:lastEdition: 2024-12-19
- dita:id: TeradataPython_FxRef_Lake_2000
- Product Category: Teradata Vantage
- teradataml.dataframe.dataframe.DataFrame.replace = replace(self, to_replace, value=None, subset=None)
- DESCRIPTION:
Replaces every occurrence of "to_replace" with "value" in the
columns specified in "subset". When "subset" is not provided, the
function replaces values in all columns.
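The behavior described above can be modeled in plain Python. This is only an illustrative sketch of the semantics, not teradataml code: the helper name replace_literal and the list-of-dictionaries row format are assumptions made for the illustration, and no Vantage connection is involved.

```python
def replace_literal(rows, to_replace, value, subset=None):
    """Return new rows with `to_replace` swapped for `value` in the chosen columns.

    Hypothetical helper: rows are plain dictionaries, not a teradataml DataFrame.
    """
    result = []
    for row in rows:
        # With no subset, every column of the row is considered.
        columns = row.keys() if subset is None else subset
        new_row = dict(row)  # copy so the input rows stay unchanged
        for col in columns:
            if new_row[col] == to_replace:
                new_row[col] = value
        result.append(new_row)
    return result


rows = [
    {"stats": "Advanced", "programming": "Advanced"},
    {"stats": "Novice", "programming": "Advanced"},
]

# Restrict the replacement to the 'stats' column.
only_stats = replace_literal(rows, "Advanced", "Good", subset=["stats"])

# No subset: the replacement applies to all columns.
all_columns = replace_literal(rows, "Advanced", "Good")
```

In this model, only_stats leaves the 'programming' values untouched, while all_columns changes 'Advanced' wherever it appears.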
PARAMETERS:
to_replace:
Required Argument.
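Specifies the ColumnExpression or literal value that the function
searches for in the column(s). Use a ColumnExpression to match on
the result of a DataFrameColumn function; otherwise, use a literal.
To specify several replacements at once, pass a dictionary in which
each key is the value to search for and the corresponding dictionary
value is its replacement.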
Note:
Only ColumnExpressions generated from DataFrameColumn
functions are supported. BinaryExpressions are not supported.
Example: Consider a teradataml DataFrame with two columns, COL1 and COL2.
df.COL1.abs() is supported, but df.COL1 == df.COL2 is not
supported.
Supported column types: CHAR, VARCHAR, FLOAT, INTEGER, DECIMAL
Types: ColumnExpression OR int OR float OR str OR dict
value:
Required Argument when "to_replace" is not a dictionary; optional otherwise.
Specifies the ColumnExpression or literal that replaces the values
matched by "to_replace" in the column(s). Use a ColumnExpression to
derive the replacement from a DataFrameColumn function; otherwise,
use a literal.
Notes:
* Argument is ignored if "to_replace" is a dictionary.
* Only ColumnExpressions generated from DataFrameColumn
functions are supported. BinaryExpressions are not supported.
Example: Consider a teradataml DataFrame with two columns, COL1 and COL2.
df.COL1.abs() is supported, but df.COL1 == df.COL2 is not
supported.
Supported column types: CHAR, VARCHAR, FLOAT, INTEGER, DECIMAL
Types: ColumnExpression OR int OR float OR str
subset:
Optional Argument.
Specifies column(s) to consider for replacing the values.
Types: ColumnExpression OR str OR list
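The rules for how the three arguments interact can be summarized in a small plain-Python sketch. It is not teradataml's implementation: the helper name normalize_arguments is hypothetical, ValueError stands in for TeradataMlException, and ColumnExpression inputs are not modeled.

```python
def normalize_arguments(to_replace, value=None, subset=None):
    """Return (pairs, columns) for a replace call.

    Hypothetical helper. `pairs` is a list of (old, new) tuples and
    `columns` is a list of column names, or None meaning "all columns".
    """
    if isinstance(to_replace, dict):
        # Dictionary form: each key maps to its replacement; "value" is ignored.
        pairs = list(to_replace.items())
    elif value is None:
        # Stand-in for the TeradataMlException the real function raises.
        raise ValueError('"value" is required when "to_replace" is not a dictionary')
    else:
        pairs = [(to_replace, value)]

    if subset is None:
        columns = None  # consider every column
    elif isinstance(subset, str):
        columns = [subset]  # a single column name
    else:
        columns = list(subset)  # a list of column names
    return pairs, columns


dict_form = normalize_arguments({"yes": 1, "no": 0}, subset="masters")
literal_form = normalize_arguments("Advanced", "Good")
```

Under these assumptions, the dictionary form yields one (old, new) pair per key, and a single column name given as a string is treated the same as a one-element list.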
RAISES:
TeradataMlException
RETURNS:
teradataml DataFrame
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
# Create a DataFrame on 'admissions_train' table.
>>> df = DataFrame("admissions_train")
>>> print(df)
    masters   gpa     stats  programming  admitted
id
15      yes  4.00  Advanced     Advanced         1
34      yes  3.85  Advanced     Beginner         0
13       no  4.00  Advanced       Novice         1
38      yes  2.65  Advanced     Beginner         1
5        no  3.44    Novice       Novice         0
40      yes  3.95    Novice     Beginner         0
7       yes  2.33    Novice       Novice         1
22      yes  3.46    Novice     Beginner         0
26      yes  3.57  Advanced     Advanced         1
17       no  3.83  Advanced     Advanced         1
# Example 1: Replace the string 'Advanced' with 'Good' in columns 'stats'
# and 'programming'.
>>> res = df.replace("Advanced", "Good", subset=["stats", "programming"])
>>> print(res)
    masters   gpa   stats  programming  admitted
id
13       no  4.00    Good       Novice         1
36       no  3.00    Good       Novice         0
15      yes  4.00    Good         Good         1
40      yes  3.95  Novice     Beginner         0
22      yes  3.46  Novice     Beginner         0
38      yes  2.65    Good     Beginner         1
26      yes  3.57    Good         Good         1
5        no  3.44  Novice       Novice         0
7       yes  2.33  Novice       Novice         1
19      yes  1.98    Good         Good         0
# Example 2: Replace the string 'Advanced' with 'Good' and 'Beginner' with 'starter'
# in columns 'stats' and 'programming'.
>>> res = df.replace({"Advanced": "Good", "Beginner": "starter"}, subset=["stats", "programming"])
>>> print(res)
    masters   gpa   stats  programming  admitted
id
15      yes  4.00    Good         Good         1
7       yes  2.33  Novice       Novice         1
22      yes  3.46  Novice      starter         0
17       no  3.83    Good         Good         1
13       no  4.00    Good       Novice         1
38      yes  2.65    Good      starter         1
26      yes  3.57    Good         Good         1
5        no  3.44  Novice       Novice         0
34      yes  3.85    Good      starter         0
40      yes  3.95  Novice      starter         0
# Example 3: Append the string '_New' to the 'stats' column when the values in
# 'programming' and 'stats' are the same.
>>> res = df.replace({df.programming: df.stats+"_New"}, subset=["stats"])
>>> print(res)
    masters   gpa         stats  programming  admitted
id
15      yes  4.00  Advanced_New     Advanced         1
34      yes  3.85      Advanced     Beginner         0
13       no  4.00      Advanced       Novice         1
38      yes  2.65      Advanced     Beginner         1
5        no  3.44    Novice_New       Novice         0
40      yes  3.95        Novice     Beginner         0
7       yes  2.33    Novice_New       Novice         1
22      yes  3.46        Novice     Beginner         0
26      yes  3.57  Advanced_New     Advanced         1
17       no  3.83  Advanced_New     Advanced         1
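Judging from the output of Example 3, the ColumnExpression key appears to be compared row by row against the column named in "subset". The plain-Python sketch below models that reading. It is an interpretation of the example, not teradataml code, and the helper name append_when_equal and the dictionary row format are assumptions.

```python
def append_when_equal(rows, target, other, suffix):
    """Append `suffix` to row[target] wherever row[target] equals row[other].

    Hypothetical helper that mirrors the row-wise effect seen in Example 3.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        if row[target] == row[other]:
            new_row[target] = row[target] + suffix
        result.append(new_row)
    return result


# Three rows taken from the 'admissions_train' output shown earlier.
rows = [
    {"id": 15, "stats": "Advanced", "programming": "Advanced"},
    {"id": 34, "stats": "Advanced", "programming": "Beginner"},
    {"id": 5, "stats": "Novice", "programming": "Novice"},
]

res = append_when_equal(rows, "stats", "programming", "_New")
```

In this model, rows 15 and 5 receive the '_New' suffix because their 'stats' and 'programming' values match, and row 34 is left unchanged, which agrees with the output of Example 3.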
# Example 4: Round the values of 'gpa' to the nearest integer.
>>> res = df.replace({df.gpa: df.gpa.round(0)}, subset=["gpa"])
>>> print(res)
    masters  gpa     stats  programming  admitted
id
15      yes  4.0  Advanced     Advanced         1
7       yes  2.0    Novice       Novice         1
22      yes  3.0    Novice     Beginner         0
17       no  4.0  Advanced     Advanced         1
13       no  4.0  Advanced       Novice         1
38      yes  3.0  Advanced     Beginner         1
26      yes  4.0  Advanced     Advanced         1
5        no  3.0    Novice       Novice         0
34      yes  4.0  Advanced     Beginner         0
40      yes  4.0    Novice     Beginner         0
# Example 5: Replace the value of 'masters' with '1' if the value is 'yes'
# and with '0' if the value is 'no'.
>>> res = df.replace({'yes': 1, 'no': 0}, subset=["masters"])
>>> print(res)
    masters   gpa     stats  programming  admitted
id
15        1  4.00  Advanced     Advanced         1
7         1  2.33    Novice       Novice         1
22        1  3.46    Novice     Beginner         0
17        0  3.83  Advanced     Advanced         1
13        0  4.00  Advanced       Novice         1
38        1  2.65  Advanced     Beginner         1
26        1  3.57  Advanced     Advanced         1
5         0  3.44    Novice       Novice         0
34        1  3.85  Advanced     Beginner         0
40        1  3.95    Novice     Beginner         0