Use the distinct() function to generate a new ColumnExpression that removes duplicate rows from the result.
- This function is supported only in projection.
- This function is not supported for sorting or filtering data.
Example
>>> from teradataml import *
>>> load_example_data("dataframe","sales")
>>> df = DataFrame.from_table('sales')
>>> df
              Feb    Jan    Mar    Apr    datetime
accounts
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017
Red Inc     200.0  150.0  140.0    NaN  04/01/2017
>>> df.assign(drop_columns=True, distinct_feb=df.Feb.distinct())
   distinct_feb
0         210.0
1          90.0
2         200.0
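
As a rough local illustration of what a distinct projection does, the sketch below uses Python's standard sqlite3 module instead of teradataml and Vantage. The in-memory table and the names feb_values and distinct_feb are assumptions made for this illustration; only the Feb values are taken from the sales example above. It does not show the SQL that teradataml generates.

```python
import sqlite3

# Feb values copied from the six rows of the sales example above.
feb_values = [90.0, 210.0, 200.0, 90.0, 210.0, 200.0]

# Hypothetical stand-in for the sales table, holding only the Feb column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Feb REAL)")
conn.executemany("INSERT INTO sales (Feb) VALUES (?)", [(v,) for v in feb_values])

# DISTINCT is applied in the projection (the select list), so each
# repeated Feb value appears only once in the result.
rows = conn.execute("SELECT DISTINCT Feb FROM sales").fetchall()

# Row order of a DISTINCT query is not guaranteed, so sort before printing.
distinct_feb = sorted(row[0] for row in rows)
print(distinct_feb)

conn.close()
```

The six input rows collapse to three values, matching the three rows in the distinct_feb output above. The sort is only there to make the printed order stable; as in the teradataml output, the order of distinct rows should not be relied on.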