Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment: VantageCloud
- Edition: Lake
- Product: Teradata Package for Python
- Release Number: 20.00.00.03
- Published: December 2024
- Last edition: 2024-12-19
- Product Category: Teradata Vantage
teradataml.dataframe.dataframe.DataFrame.drop_duplicate = drop_duplicate(self, column_names=None)
DESCRIPTION:
Drops duplicate rows, i.e., returns the distinct rows of the teradataml DataFrame.
PARAMETERS:
column_names:
Optional argument.
Specifies the name(s) of the column(s) to consider when dropping duplicates, i.e., the
column(s) whose distinct values are returned. If not specified, all columns in the
DataFrame are considered for the operation.
Types: str OR list of Strings (str)
RETURNS:
teradataml DataFrame
RAISES:
TeradataMlException
EXAMPLES:
# Create a teradataml DataFrame (assumes an active connection to Vantage).
>>> from teradataml import DataFrame, load_example_data
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame('admissions_train')
>>> df
    masters   gpa     stats  programming  admitted
id
13       no  4.00  Advanced       Novice         1
26      yes  3.57  Advanced     Advanced         1
5        no  3.44    Novice       Novice         0
19      yes  1.98  Advanced     Advanced         0
15      yes  4.00  Advanced     Advanced         1
40      yes  3.95    Novice     Beginner         0
7       yes  2.33    Novice       Novice         1
22      yes  3.46    Novice     Beginner         0
36       no  3.00  Advanced       Novice         0
38      yes  2.65  Advanced     Beginner         1
# Example 1: Get the distinct values of the column 'programming'.
>>> df.drop_duplicate('programming')
  programming
0    Beginner
1    Advanced
2      Novice
# Example 2: Get the distinct combinations of values of the columns 'programming' and 'admitted'.
>>> df.drop_duplicate(['programming', 'admitted'])
  programming  admitted
0    Advanced         1
1      Novice         0
2      Novice         1
3    Beginner         1
4    Advanced         0
5    Beginner         0
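
The semantics described above can be illustrated without a Vantage connection. The following is only a sketch in plain Python, not teradataml code: `distinct_rows` is a hypothetical helper that mimics the documented behaviour of `column_names` (a single name, a list of names, or `None` for all columns) on a list of dictionaries. The sample rows are copied from the `admissions_train` output shown earlier. The helper keeps rows in first-seen order, whereas the row order of the actual `drop_duplicate` result should not be relied on.

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Row = Dict[str, Any]


def distinct_rows(
    rows: Sequence[Row],
    column_names: Optional[Union[str, List[str]]] = None,
) -> List[Row]:
    """Hypothetical helper (NOT part of teradataml).

    Returns the distinct combinations of values for the given columns,
    mirroring the documented contract of DataFrame.drop_duplicate.
    """
    if not rows:
        return []

    # Normalize column_names: None -> all columns, str -> one-element list.
    if column_names is None:
        columns = list(rows[0].keys())
    elif isinstance(column_names, str):
        columns = [column_names]
    else:
        columns = list(column_names)

    seen = set()
    result: List[Row] = []
    for row in rows:
        # The tuple of selected values identifies a duplicate.
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# A few rows copied from the admissions_train sample output.
rows = [
    {"id": 13, "masters": "no", "gpa": 4.00, "stats": "Advanced",
     "programming": "Novice", "admitted": 1},
    {"id": 26, "masters": "yes", "gpa": 3.57, "stats": "Advanced",
     "programming": "Advanced", "admitted": 1},
    {"id": 5, "masters": "no", "gpa": 3.44, "stats": "Novice",
     "programming": "Novice", "admitted": 0},
    {"id": 19, "masters": "yes", "gpa": 1.98, "stats": "Advanced",
     "programming": "Advanced", "admitted": 0},
    {"id": 40, "masters": "yes", "gpa": 3.95, "stats": "Novice",
     "programming": "Beginner", "admitted": 0},
]

# Analogous to df.drop_duplicate('programming')
by_programming = distinct_rows(rows, "programming")

# Analogous to df.drop_duplicate(['programming', 'admitted'])
by_pair = distinct_rows(rows, ["programming", "admitted"])

# Analogous to df.drop_duplicate(): every column takes part in the comparison.
by_all = distinct_rows(rows)
```

Because the `id` values differ in every row, the call without `column_names` keeps all five sample rows, while restricting the comparison to `programming` collapses them to three.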