drop_duplicate() Function - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2024
Language: English (United States)
Last Update: 2024-04-09
Product Category: Teradata Vantage

Use the drop_duplicate() function to drop duplicate rows from a teradataml DataFrame and return its distinct rows.

Optional Argument:
  • column_names: Specifies the name of a column, or a list of column names, whose duplicate values are dropped. The result contains the distinct combinations of values in these columns.

    If not specified, all columns in the DataFrame are considered for the operation.
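The effect of column_names can be modeled locally in plain Python. The sketch below is only an illustration of the result described above, not teradataml's implementation: the helper distinct_rows and the sample rows are hypothetical, and the sketch keeps first-seen order, which the DataFrame output is not assumed to follow.

```python
from typing import Iterable, Optional, Sequence, Union


def distinct_rows(
    rows: Iterable[dict],
    column_names: Optional[Union[str, Sequence[str]]] = None,
) -> list:
    """Hypothetical local model of the rows drop_duplicate() returns."""
    rows = list(rows)
    if column_names is None:
        # No columns given: consider every column (taken from the first row).
        columns = list(rows[0].keys()) if rows else []
    elif isinstance(column_names, str):
        # A single column name is treated as a one-element list.
        columns = [column_names]
    else:
        columns = list(column_names)

    seen = set()
    result = []
    for row in rows:
        # Project the row onto the chosen columns and keep unseen combinations.
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Hypothetical sample rows, not read from a real table.
sample = [
    {"programming": "Novice", "admitted": 1},
    {"programming": "Advanced", "admitted": 1},
    {"programming": "Novice", "admitted": 1},
    {"programming": "Novice", "admitted": 0},
]

print(distinct_rows(sample, "programming"))  # one row per distinct value
print(distinct_rows(sample))                 # all columns considered
```

Passing a single name projects the rows onto that column before removing duplicates, while omitting the argument compares whole rows.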

Example Setup

The examples use the "admissions_train" dataset.

>>> from teradataml import *
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame("admissions_train")
# Print dataframe.
>>> df
      masters   gpa     stats programming admitted
   id
   13      no  4.00  Advanced      Novice        1
   26     yes  3.57  Advanced    Advanced        1
   5       no  3.44    Novice      Novice        0
   19     yes  1.98  Advanced    Advanced        0
   15     yes  4.00  Advanced    Advanced        1
   40     yes  3.95    Novice    Beginner        0
   7      yes  2.33    Novice      Novice        1
   22     yes  3.46    Novice    Beginner        0
   36      no  3.00  Advanced      Novice        0
   38     yes  2.65  Advanced    Beginner        1

Example 1: Get the distinct values of the column 'programming'

>>> df.drop_duplicate("programming")
  programming
0      Novice
1    Beginner
2    Advanced

Example 2: Get the distinct combinations of values of the columns 'programming' and 'admitted'

>>> df.drop_duplicate(["programming","admitted"])
  programming  admitted
0    Beginner         0
1    Advanced         1
2    Beginner         1
3    Advanced         0
4      Novice         1
5      Novice         0
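As a rough cross-check that needs no Vantage connection, the counts in Example 1 and Example 2 can be compared with the ten rows printed in Example Setup. This is a sketch in plain Python with the values copied by hand from that output. The printed rows are only part of the table, so agreement here is an observation about the sample, not a guarantee for the full dataset.

```python
# (programming, admitted) values of the ten rows printed in Example Setup.
displayed = [
    ("Novice", 1),    # id 13
    ("Advanced", 1),  # id 26
    ("Novice", 0),    # id 5
    ("Advanced", 0),  # id 19
    ("Advanced", 1),  # id 15
    ("Beginner", 0),  # id 40
    ("Novice", 1),    # id 7
    ("Beginner", 0),  # id 22
    ("Novice", 0),    # id 36
    ("Beginner", 1),  # id 38
]

# Distinct values of 'programming' alone (compare with Example 1).
programming_values = {programming for programming, _ in displayed}

# Distinct ('programming', 'admitted') combinations (compare with Example 2).
pairs = set(displayed)

print(sorted(programming_values))  # ['Advanced', 'Beginner', 'Novice']
print(len(pairs))                  # 6
```

The three values and six combinations match the row counts in the two example outputs, although the row order returned by the DataFrame differs from the sorted order used here.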