DataFrame Manipulation

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

You can manipulate a DataFrame with methods and operators. DataFrames created with the DataFrame() constructor, DataFrame.from_table(), or DataFrame.from_query() all have the same methods and operators.

DataFrame Methods

A DataFrame method has the basic syntax DataFrame.method(arguments). Using the specified DataFrame and arguments, most methods return a new DataFrame; the exceptions are noted after the table below. The specified DataFrame remains unchanged.
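As a minimal sketch of the "returns a new DataFrame, leaves the original unchanged" behavior, the helper below wraps a single assign() call. The function name, the gpa column, the gpa_scaled column, and the admissions_train table are hypothetical and used only for illustration; the commented usage assumes a reachable Vantage system.

```python
def with_scaled_gpa(df, factor=25):
    """Return a new DataFrame with an extra 'gpa_scaled' column.

    `df` is assumed to be a teradataml DataFrame that has a numeric
    'gpa' column (a hypothetical column name, for illustration only).
    """
    # assign() takes the new columns as keyword arguments and returns a
    # new DataFrame; `df` itself is not modified.
    return df.assign(gpa_scaled=df.gpa * factor)


# Usage (requires a live Vantage connection, so it is left commented out):
# from teradataml import create_context, DataFrame
# create_context(host="<host>", username="<user>", password="<password>")
# students = DataFrame.from_table("admissions_train")  # hypothetical table
# scaled = with_scaled_gpa(students)
# print(students.columns)  # still the original column list
# print(scaled.columns)    # original columns plus 'gpa_scaled'
```

Keeping the original DataFrame untouched means intermediate results can be reused freely when building several derived DataFrames from the same source.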

DataFrame Method Description
assign() Method Assigns new column expressions in a DataFrame.
concat() Method Concatenates two teradataml DataFrame objects along the index axis.
describe() Method Generates statistics for numeric columns. Computes the count, mean, std, min, percentiles, and max for numeric columns.
drop() Method Drops specified labels from rows or columns in a DataFrame.
dropna() Method Removes rows with null values in a DataFrame.
filter() Method Returns only the filtered columns or rows (based on the index) of a DataFrame. The filter is specified as items, like, or regex. Other filters are the operators index[] and loc[].

get() Method Retrieves required columns from a DataFrame using column names as key.
get_values() Method Retrieves all values (only) present in a DataFrame.
groupby() Method Returns all columns of a DataFrame, grouped as specified.
head() Method Returns the first n rows of a DataFrame.
join() Method Joins two different teradataml DataFrames together.
map_row() Method Applies a function to every row in a teradataml DataFrame and returns a teradataml DataFrame.
map_partition() Method Applies a function to a group or partition of rows in a teradataml DataFrame and returns a teradataml DataFrame.
merge() Method Merges two teradataml DataFrames together.
sample() Method Samples rows from a DataFrame, directly or based on conditions.
select() Method Returns only the selected columns of a DataFrame.
set_index() Method Assigns one or more existing columns as the new index to a DataFrame.
show_query() Method Returns underlying SQL for the teradataml DataFrame.
sort() Method Returns all columns of a DataFrame, sorted as specified.
sort_index() Method Returns sorted objects by labels (along an axis) in either ascending or descending order for a DataFrame.
squeeze() Method Squeezes one-dimensional axis objects into a scalar for DataFrames with a single element, or a Series object for a DataFrame with a single column.
tail() Method Returns the last n rows of the sorted DataFrame.
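Because the methods in this table return DataFrames, they can be chained. The sketch below combines dropna() with select(), and shows filter() with its like argument. The helper names and their parameters are hypothetical; the keyword arguments (subset, like, axis) are used as I understand the teradataml signatures and should be checked against the API reference for your release.

```python
def clean_and_project(df, required_columns, keep_columns):
    """Drop rows with nulls in required_columns, then keep keep_columns."""
    # dropna(subset=...) only looks for nulls in the listed columns.
    without_nulls = df.dropna(subset=required_columns)
    # select() returns a DataFrame limited to the named columns.
    return without_nulls.select(keep_columns)


def columns_matching(df, fragment):
    """Keep only the columns whose names contain `fragment`."""
    # filter() takes one of items, like, or regex; axis=1 targets columns.
    return df.filter(like=fragment, axis=1)


# Usage, assuming `students` is an existing teradataml DataFrame with
# hypothetical columns 'id', 'gpa', and 'masters':
# tidy = clean_and_project(students, ["gpa"], ["id", "gpa"])
# print(tidy.show_query())  # underlying SQL for the chained result
```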

For a list of regular aggregate functions supported by DataFrame, see Regular Aggregate Functions Supported by DataFrame.

For a list of time series aggregate functions, see Time Series Aggregate Functions.

The describe() and get_values() methods do not return a teradataml DataFrame.
The groupby() method returns an instance of teradataml DataFrameGroupBy, which inherits from teradataml DataFrame.
The groupby_time() and resample() methods return an instance of teradataml DataFrameGroupByTime, which inherits from teradataml DataFrame.
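To illustrate the groupby() note, the following sketch groups a DataFrame and then applies a regular aggregate to the grouped object. The helper name and the masters column in the usage comment are hypothetical; count() is used here as one example of an aggregate.

```python
def rows_per_group(df, group_columns):
    """Count the non-null values of each column within each group."""
    # groupby() returns a DataFrameGroupBy object rather than a plain
    # DataFrame; aggregates are then called on that grouped object.
    grouped = df.groupby(group_columns)
    return grouped.count()


# Usage, assuming `students` is an existing teradataml DataFrame with a
# hypothetical 'masters' column:
# counts = rows_per_group(students, "masters")
# print(counts)
```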

Operators

A DataFrame operator has one of the following basic syntaxes:
  • DataFrame.loc[arguments]
  • DataFrame.iloc[arguments]
  • DataFrame[arguments]

All operators act as filters. The filter() method is another way to filter a DataFrame.

DataFrame Operator Description
index[] Operator Returns only the filtered rows of a DataFrame. The filter uses logical expressions composed of DataFrame columns and Python literals.

loc[] Operator Returns a new DataFrame that has only the filtered columns and rows of a DataFrame, accessed by labels.
iloc[] Operator Returns a new DataFrame that has only the filtered columns and rows of a DataFrame, accessed by integer values.
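The three operators can be sketched side by side as follows. The helper names and the id and gpa columns are hypothetical, and the exact forms accepted inside loc[] and iloc[] (conditions, label lists, slices) should be confirmed against the teradataml API reference.

```python
def high_gpa_rows(df, threshold):
    """index[]: keep the rows where a logical expression is true."""
    # The expression combines a DataFrame column with a Python literal.
    return df[df.gpa > threshold]


def high_gpa_ids(df, threshold):
    """loc[]: filter rows by condition and columns by label."""
    return df.loc[df.gpa > threshold, ["id", "gpa"]]


def first_block(df, n_rows, n_cols):
    """iloc[]: select rows and columns by integer position."""
    return df.iloc[0:n_rows, 0:n_cols]


# Usage, assuming `students` is an existing teradataml DataFrame with
# hypothetical columns 'id' and 'gpa':
# top = high_gpa_rows(students, 3.5)
# top_ids = high_gpa_ids(students, 3.5)
# corner = first_block(students, 5, 2)
```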