Teradata® Package for Python Function Reference
Product: Teradata Package for Python
Release Number: 17.10
Published: April 2022
Language: English (United States)
Last Update: 2022-08-19
Lifecycle: previous
Product Category: Teradata Vantage
teradataml.geospatial.geodataframe.GeoDataFrame.dropna = dropna(self, how='any', thresh=None, subset=None)
DESCRIPTION:
Removes rows with null values.
PARAMETERS:
how:
Optional Argument.
Specifies how rows are removed.
'any' removes rows with at least one null value.
'all' removes rows with all null values.
Default Value: 'any'
Permitted Values: 'any' or 'all'
Types: str
thresh:
Optional Argument.
Specifies the minimum number of non-null values a row must contain to be kept.
Types: int
subset:
Optional Argument.
Specifies the column name, or an array-like list of column names, to check for null values.
Types: str OR list of Strings (str)
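The way "how", "thresh" and "subset" interact can be modelled in plain Python without a Vantage connection. The following is a minimal sketch of the row-retention rule only, not teradataml's implementation: the names keep_row and dropna_model are hypothetical, rows are represented as dictionaries, and the sketch assumes that "thresh", when given, takes precedence over "how" (the reference does not state this).

```python
from typing import Any, Dict, List, Optional, Sequence, Union

Row = Dict[str, Any]


def keep_row(row: Row,
             how: str = "any",
             thresh: Optional[int] = None,
             subset: Union[str, Sequence[str], None] = None) -> bool:
    """Hypothetical helper: True if the row survives under the modelled rules."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")

    # subset may be one column name or a list of names; default is every column.
    if subset is None:
        columns = list(row)
    elif isinstance(subset, str):
        columns = [subset]
    else:
        columns = list(subset)

    # Count the non-null values among the columns being checked.
    non_null = sum(row[col] is not None for col in columns)

    if thresh is not None:
        # Assumption: thresh overrides how when both are supplied.
        return non_null >= thresh
    if how == "any":
        # Dropped if at least one checked column is null.
        return non_null == len(columns)
    # how == "all": dropped only if every checked column is null.
    return non_null > 0


def dropna_model(rows: List[Row], **kwargs: Any) -> List[Row]:
    """Hypothetical helper: filter a list of dict rows with keep_row."""
    return [row for row in rows if keep_row(row, **kwargs)]


# Two stand-in rows shaped like the sample data: skey is always set,
# geosequence is sometimes null. The string is a placeholder value.
rows = [
    {"skey": 1001, "geosequence": "<geosequence value>"},
    {"skey": 1006, "geosequence": None},
]

print([r["skey"] for r in dropna_model(rows)])
print([r["skey"] for r in dropna_model(rows, how="all", subset=["skey", "geosequence"])])
print([r["skey"] for r in dropna_model(rows, thresh=2)])
```

Under this model, how='all' over both columns keeps every row, because "skey" is never null, whereas the default how='any' and thresh=2 each keep only the rows whose "geosequence" is populated.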
RETURNS:
teradataml GeoDataFrame
RAISES:
TeradataMlException
EXAMPLES:
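>>> from teradataml import load_example_data
>>> from teradataml.geospatial.geodataframe import GeoDataFrame
>>> load_example_data("geodataframe", "sample_shapes")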
>>> df = GeoDataFrame('sample_shapes').select(["skey", "geosequence"])
>>> df
geosequence
skey
1006 None
1001 GEOSEQUENCE((10 20,30 40,50 60),(2007-08-22 12:05:23.560000,2007-08-22 12:08:25.140000,2007-08-22 12:11:41.520000),(1,2,3),(2,10,12,11,18,21,19))
1002 GEOSEQUENCE((10 10,15 15,-2 0),(2007-03-14 01:35:00.000000,2007-03-14 01:35:05.000000,2007-03-14 01:35:08.000000),(1222,1223,1224),(2,12.1,3.14159,2.78128,-10,-11,100.1))
1010 None
1004 None
1003 None
1008 None
1005 None
1007 None
1009 None
>>>
# Drop the rows where at least one element is null.
>>> df.dropna()
geosequence
skey
1002 GEOSEQUENCE((10 10,15 15,-2 0),(2007-03-14 01:35:00.000000,2007-03-14 01:35:05.000000,2007-03-14 01:35:08.000000),(1222,1223,1224),(2,12.1,3.14159,2.78128,-10,-11,100.1))
1001 GEOSEQUENCE((10 20,30 40,50 60),(2007-08-22 12:05:23.560000,2007-08-22 12:08:25.140000,2007-08-22 12:11:41.520000),(1,2,3),(2,10,12,11,18,21,19))
>>>
# Drop the rows where both 'geosequence' and 'skey' are null.
>>> df.dropna(how='all', subset=['skey','geosequence'])
geosequence
skey
1008 None
1003 None
1009 None
1007 None
1005 None
1001 GEOSEQUENCE((10 20,30 40,50 60),(2007-08-22 12:05:23.560000,2007-08-22 12:08:25.140000,2007-08-22 12:11:41.520000),(1,2,3),(2,10,12,11,18,21,19))
1006 None
1004 None
1010 None
1002 GEOSEQUENCE((10 10,15 15,-2 0),(2007-03-14 01:35:00.000000,2007-03-14 01:35:05.000000,2007-03-14 01:35:08.000000),(1222,1223,1224),(2,12.1,3.14159,2.78128,-10,-11,100.1))
>>>
# Keep only the rows with at least 2 non-null values.
>>> df.dropna(thresh=2)
geosequence
skey
1001 GEOSEQUENCE((10 20,30 40,50 60),(2007-08-22 12:05:23.560000,2007-08-22 12:08:25.140000,2007-08-22 12:11:41.520000),(1,2,3),(2,10,12,11,18,21,19))
1002 GEOSEQUENCE((10 10,15 15,-2 0),(2007-03-14 01:35:00.000000,2007-03-14 01:35:05.000000,2007-03-14 01:35:08.000000),(1222,1223,1224),(2,12.1,3.14159,2.78128,-10,-11,100.1))
>>>