dropna() Method

Teradata® Python Package User Guide

Brand: Teradata Vantage
Product: Teradata Python Package
Release: 16.20
Category: User Guide
Document number: B700-4006-098K

Use the dropna() method to remove rows that contain null values from a DataFrame.

The how argument accepts 'any' or 'all':
  • 'any': Removes rows that contain at least one null value.
  • 'all': Removes only rows in which every value is null.
The default is 'any'.
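
The difference between the two options can be illustrated with a small plain-Python model. This is only a sketch of the semantics described above, not teradataml code; the helper keep_row and the sample rows are hypothetical, and None stands in for a null value.

```python
# Plain-Python model of the 'how' semantics (not teradataml code).
# keep_row is a hypothetical helper; None stands in for a null value.

def keep_row(row, how="any"):
    """Return True if the row would survive under the given 'how'."""
    nulls = [value is None for value in row.values()]
    if how == "any":
        # Drop the row if at least one value is null.
        return not any(nulls)
    if how == "all":
        # Drop the row only if every value is null.
        return not all(nulls)
    raise ValueError("how must be 'any' or 'all'")

rows = [
    {"Jan": 150, "Mar": 140},    # no nulls
    {"Jan": None, "Mar": 95},    # one null
    {"Jan": None, "Mar": None},  # all nulls
]

kept_any = [r for r in rows if keep_row(r, "any")]
kept_all = [r for r in rows if keep_row(r, "all")]
print(len(kept_any), len(kept_all))  # prints: 1 2
```

With 'any', only the fully populated row remains; with 'all', the partially populated row is kept as well.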

Use the thresh argument to specify the minimum number of non-null values a row must contain to be kept.

Use the subset argument to limit the search for null values to specific columns.
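
The effect of subset can be sketched in plain Python as follows. This is a model of the described behavior under stated assumptions, not the teradataml implementation; dropna_rows is a hypothetical helper that works on a list of dictionaries, with None standing in for null.

```python
# Plain-Python model of 'subset' (not teradataml code).
# Only the listed columns are inspected for nulls; the others are ignored.

def dropna_rows(rows, how="any", subset=None):
    """Hypothetical helper: filter dict rows the way dropna() is described."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    check = any if how == "any" else all
    kept = []
    for row in rows:
        # Inspect every column unless a subset is given.
        columns = subset if subset is not None else list(row)
        if not check(row[col] is None for col in columns):
            kept.append(row)
    return kept

rows = [
    {"accounts": "Red Inc", "Jan": 150, "Apr": None},
    {"accounts": "Orange Inc", "Jan": None, "Apr": 250},
]

# Without subset, both rows contain a null and are dropped.
kept_default = [r["accounts"] for r in dropna_rows(rows)]
# Restricting the search to 'Jan' ignores the null in 'Apr'.
kept_jan = [r["accounts"] for r in dropna_rows(rows, subset=["Jan"])]
print(kept_default)  # prints: []
print(kept_jan)      # prints: ['Red Inc']
```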

Examples Prerequisite

Assume the table "sales" exists and a DataFrame "df" is created from it using the following command:

>>> df = DataFrame("sales")
>>> df
                Feb   Jan   Mar   Apr    datetime
accounts
Jones LLC     200.0   150   140   180  2017-04-01
Yellow Inc     90.0  None  None  None  2017-04-01
Orange Inc    210.0  None  None   250  2017-04-01
Blue Inc       90.0    50    95   101  2017-04-01
Alpha Co      210.0   200   215   250  2017-04-01
Red Inc       200.0   150   140  None  2017-04-01

Example: Drop rows with at least one null value

>>> df.dropna()
                Feb  Jan  Mar  Apr    datetime
accounts
Blue Inc       90.0   50   95  101  2017-04-01
Jones LLC     200.0  150  140  180  2017-04-01
Alpha Co      210.0  200  215  250  2017-04-01

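Example: Keep rows with at least four non-null values

The output suggests that the count includes the accounts column, which is why Orange Inc (accounts, Feb, Apr, datetime) is kept.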

>>> df.dropna(thresh=4)
              Feb   Jan   Mar   Apr    datetime
accounts
Jones LLC   200.0   150   140   180  2017-04-01
Blue Inc     90.0    50    95   101  2017-04-01
Orange Inc  210.0  None  None   250  2017-04-01
Alpha Co    210.0   200   215   250  2017-04-01
Red Inc     200.0   150   140  None  2017-04-01

Example: Keep rows with at least five non-null values

>>> df.dropna(thresh=5)
             Feb  Jan  Mar   Apr    datetime
accounts
Alpha Co   210.0  200  215   250  2017-04-01
Jones LLC  200.0  150  140   180  2017-04-01
Blue Inc    90.0   50   95   101  2017-04-01
Red Inc    200.0  150  140  None  2017-04-01
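
The two thresh results can be cross-checked by counting non-null values per row in plain Python. This sketch is not teradataml code; it copies the "sales" data from the prerequisite section and assumes that the accounts column counts toward the total, which is what the outputs shown for thresh=4 and thresh=5 suggest. The helper names are hypothetical.

```python
# Plain-Python count of non-null values per row of the "sales" data
# (not teradataml code). Assumption: the accounts column is counted.

sales = {
    # accounts:   [Feb,   Jan,  Mar,  Apr,  datetime]
    "Jones LLC":  [200.0, 150, 140, 180, "2017-04-01"],
    "Yellow Inc": [90.0, None, None, None, "2017-04-01"],
    "Orange Inc": [210.0, None, None, 250, "2017-04-01"],
    "Blue Inc":   [90.0, 50, 95, 101, "2017-04-01"],
    "Alpha Co":   [210.0, 200, 215, 250, "2017-04-01"],
    "Red Inc":    [200.0, 150, 140, None, "2017-04-01"],
}

def non_null_count(account, values):
    # Count the accounts value itself plus every non-null data value.
    return int(account is not None) + sum(v is not None for v in values)

def kept_with_thresh(thresh):
    # Keep rows whose non-null count reaches the threshold.
    return sorted(a for a, v in sales.items() if non_null_count(a, v) >= thresh)

kept_4 = kept_with_thresh(4)
kept_5 = kept_with_thresh(5)
print(kept_4)  # prints: ['Alpha Co', 'Blue Inc', 'Jones LLC', 'Orange Inc', 'Red Inc']
print(kept_5)  # prints: ['Alpha Co', 'Blue Inc', 'Jones LLC', 'Red Inc']
```

Under this assumption, Yellow Inc has three non-null values, Orange Inc has four, and Red Inc has five, which agrees with the rows returned in both examples.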

Example: Drop rows in which columns 'Jan' and 'Mar' are both null

>>> df.dropna(how='all', subset=['Jan','Mar'])
             Feb  Jan  Mar   Apr    datetime
accounts
Alpha Co   210.0  200  215   250  2017-04-01
Jones LLC  200.0  150  140   180  2017-04-01
Red Inc    200.0  150  140  None  2017-04-01
Blue Inc    90.0   50   95   101  2017-04-01