dropna() Method

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

Use the dropna() method to remove rows with null values in a DataFrame.

Arguments:
  • how: Optional. Specifies how rows are removed. Valid values are 'any' and 'all'.
    • 'any': Removes rows that contain at least one null value.
    • 'all': Removes rows in which all values are null.
    The default is 'any'.
  • thresh: Optional. Specifies the minimum number of non-null values a row must contain to be kept.
  • subset: Optional. Specifies a list of column names, in array-like format, to check for null values. Use this argument to limit the search for null values to specific columns.
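
The argument descriptions above can be sketched as a row-level rule in plain Python. This is an illustration of the described semantics only, not the package's implementation: the helper keep_row and the sample rows are hypothetical, None stands in for a database null, and the choice to let thresh decide on its own when it is supplied (and to count only the subset columns) is an assumption.

```python
def keep_row(row, how="any", thresh=None, subset=None):
    """Return True if `row` (a dict of column name -> value) would be kept.

    Hypothetical helper for illustration; None plays the role of a null.
    """
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")

    # Limit the null check to `subset` when it is given, else use every column.
    columns = list(row) if subset is None else list(subset)
    non_null = sum(row[c] is not None for c in columns)

    if thresh is not None:
        # Assumption: when thresh is supplied, it alone decides the outcome.
        return non_null >= thresh
    if how == "any":
        # Drop the row if at least one checked value is null.
        return non_null == len(columns)
    # how == "all": drop the row only if every checked value is null.
    return non_null > 0


# Hypothetical sample rows.
rows = [
    {"id": 1, "a": 10, "b": 20},
    {"id": 2, "a": None, "b": 20},
    {"id": 3, "a": None, "b": None},
]

kept_any = [r["id"] for r in rows if keep_row(r)]
kept_all = [r["id"] for r in rows if keep_row(r, how="all", subset=["a", "b"])]
kept_thresh = [r["id"] for r in rows if keep_row(r, thresh=2)]

print(kept_any)     # [1]
print(kept_all)     # [1, 2]
print(kept_thresh)  # [1, 2]
```

Row 3 survives neither the 'any' nor the 'all' check on columns a and b, while row 2 is dropped by 'any' but kept by 'all' because column b is non-null.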

Examples Prerequisite

Assume the table "sales" exists and a DataFrame "df" is created from it using the command:

>>> df = DataFrame("sales")
>>> df
                Feb   Jan   Mar   Apr    datetime
accounts
Jones LLC     200.0   150   140   180  2017-04-01
Yellow Inc     90.0  None  None  None  2017-04-01
Orange Inc    210.0  None  None   250  2017-04-01
Blue Inc       90.0    50    95   101  2017-04-01
Alpha Co      210.0   200   215   250  2017-04-01
Red Inc       200.0   150   140  None  2017-04-01

Example 1: Drop rows with at least one null value

>>> df.dropna()
                Feb  Jan  Mar  Apr    datetime
accounts
Blue Inc       90.0   50   95  101  2017-04-01
Jones LLC     200.0  150  140  180  2017-04-01
Alpha Co      210.0  200  215  250  2017-04-01

Example 2: Keep rows with at least four non-null values

>>> df.dropna(thresh=4)
              Feb   Jan   Mar   Apr    datetime
accounts
Jones LLC   200.0   150   140   180  2017-04-01
Blue Inc     90.0    50    95   101  2017-04-01
Orange Inc  210.0  None  None   250  2017-04-01
Alpha Co    210.0   200   215   250  2017-04-01
Red Inc     200.0   150   140  None  2017-04-01

Example 3: Keep rows with at least five non-null values

>>> df.dropna(thresh=5)
             Feb  Jan  Mar   Apr    datetime
accounts
Alpha Co   210.0  200  215   250  2017-04-01
Jones LLC  200.0  150  140   180  2017-04-01
Blue Inc    90.0   50   95   101  2017-04-01
Red Inc    200.0  150  140  None  2017-04-01
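
The results of Examples 2 and 3 can be checked by counting the non-null values in each row of the sample data. The plain-Python sketch below copies the "sales" rows from the prerequisite output and uses None for nulls. The outputs shown in the two examples match these counts only if the index column "accounts" is counted as one of the values; that is an inference from the outputs, not something this page states. The names SALES and non_null_count are hypothetical.

```python
# Rows copied from the "sales" output: accounts, Feb, Jan, Mar, Apr, datetime.
SALES = [
    ("Jones LLC", 200.0, 150, 140, 180, "2017-04-01"),
    ("Yellow Inc", 90.0, None, None, None, "2017-04-01"),
    ("Orange Inc", 210.0, None, None, 250, "2017-04-01"),
    ("Blue Inc", 90.0, 50, 95, 101, "2017-04-01"),
    ("Alpha Co", 210.0, 200, 215, 250, "2017-04-01"),
    ("Red Inc", 200.0, 150, 140, None, "2017-04-01"),
]


def non_null_count(row):
    """Count the values in a row that are not None (index column included)."""
    return sum(value is not None for value in row)


counts = {row[0]: non_null_count(row) for row in SALES}

# Accounts that meet each threshold, sorted by name for a stable order.
kept_at_4 = sorted(name for name, n in counts.items() if n >= 4)
kept_at_5 = sorted(name for name, n in counts.items() if n >= 5)

print(counts["Yellow Inc"], counts["Orange Inc"], counts["Red Inc"])  # 3 4 5
print(kept_at_4)  # ['Alpha Co', 'Blue Inc', 'Jones LLC', 'Orange Inc', 'Red Inc']
print(kept_at_5)  # ['Alpha Co', 'Blue Inc', 'Jones LLC', 'Red Inc']
```

Orange Inc has four non-null values, so it is kept at thresh=4 and dropped at thresh=5, while Yellow Inc, with three, is dropped in both cases.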

Example 4: Drop rows in which both 'Jan' and 'Mar' are null

>>> df.dropna(how='all', subset=['Jan','Mar'])
             Feb  Jan  Mar   Apr    datetime
accounts
Alpha Co   210.0  200  215   250  2017-04-01
Jones LLC  200.0  150  140   180  2017-04-01
Red Inc    200.0  150  140  None  2017-04-01
Blue Inc    90.0   50   95   101  2017-04-01