DataFrame Manipulation

Teradata® Python Package User Guide

You can manipulate a DataFrame with methods and operators. DataFrames created with DataFrame(), DataFrame.from_table(), or DataFrame.from_query() all support the same methods and operators.

DataFrame Methods

A DataFrame method has the basic syntax DataFrame_instance.method(arguments). Using the specified DataFrame and arguments, the method returns a new DataFrame. The specified DataFrame remains unchanged.
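
The "returns a new DataFrame, original unchanged" behavior can be sketched locally. This is an illustration only: it uses pandas as a stand-in, on the assumption that teradataml's assign() behaves like its pandas namesake in this respect, and the sample data and column names are hypothetical. A teradataml DataFrame would additionally need a Vantage connection, which is not shown here.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({"id": [1, 2, 3], "gpa": [3.5, 3.9, 2.8]})

# assign() returns a new DataFrame that carries the extra column.
scaled = df.assign(gpa_pct=df["gpa"] / 4.0 * 100)

# The DataFrame the method was called on is left as it was.
print(list(df.columns))
print(list(scaled.columns))
```

Because each call returns a new object, calls can be chained without altering the starting DataFrame.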

The groupby() DataFrame method returns an instance of teradataml DataFrameGroupBy, which inherits from teradataml DataFrame.

| DataFrame Method | Description |
|---|---|
| assign() | Assigns new column expressions in DataFrame_instance. |
| describe() | Generates statistics for numeric columns. Computes the count, mean, std, min, percentiles, and max for numeric columns. |
| drop() | Drops specified labels from rows or columns in DataFrame_instance. |
| dropna() | Removes rows with null values in DataFrame_instance. |
| filter() | Returns only the filtered columns or rows (based on the index) of DataFrame_instance. Filter is item, like, or regex. Other filters are operators index[] and loc[]. |
| get() | Retrieves required columns from DataFrame using column names as key. |
| get_values() | Retrieves all values (only) present in a teradataml DataFrame. |
| groupby() | Returns all columns of DataFrame_instance, grouped as specified. |
| head() | Returns the first n rows of DataFrame_instance. |
| join() | Joins two different teradataml DataFrames together. |
| merge() | Merges two teradataml DataFrames together. |
| select() | Returns only the selected columns of DataFrame_instance. |
| set_index() | Assigns one or more existing columns as the new index to a teradataml DataFrame. |
| sort() | Returns all columns of DataFrame_instance, sorted as specified. |
| tail() | Returns the last n rows of the sorted teradataml DataFrame. |

Aggregate Methods

| DataFrame Method | Description |
|---|---|
| agg() | Applies specified aggregate methods to specified columns of DataFrame_instance. |
| count() | Returns column-wise count of DataFrame_instance. |
| max() | Returns column-wise maximum value of DataFrame_instance. |
| mean() | Returns column-wise mean value of DataFrame_instance. |
| min() | Returns column-wise minimum value of DataFrame_instance. |
| std() | Returns column-wise standard deviation value of DataFrame_instance. |
| sum() | Returns column-wise sum value of DataFrame_instance. |

The describe() and get_values() DataFrame methods do not return a teradataml DataFrame.
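
As a rough local illustration of the aggregate methods, the sketch below uses pandas as a stand-in, assuming the like-named teradataml methods give comparable column-wise results. The data and the column names masters and gpa are hypothetical, and exact signatures in teradataml may differ.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "masters": ["yes", "no", "yes", "no"],
    "gpa": [3.5, 3.9, 2.5, 3.1],
})

# Column-wise maximum of the numeric column.
col_max = df[["gpa"]].max()

# agg(): several aggregates applied to a chosen column.
summary = df.agg({"gpa": ["min", "max", "mean"]})

# groupby() followed by an aggregate: one row per group.
by_group = df.groupby("masters").mean()

print(summary)
print(by_group)
```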


DataFrame Operators

A DataFrame operator has the basic syntax DataFrame_instance[arguments], DataFrame_instance.loc[arguments], or DataFrame_instance.iloc[arguments].

All operators are filters. Another filter is the filter() method.
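
The three filter kinds named for the filter() method (item, like, regex) can be sketched as follows. pandas is used here as a local stand-in, assuming teradataml's filter() selects labels in the same way; the parameter names items, like, and regex are those of the pandas method, and the sample columns are hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"id": [1, 2], "gpa": [3.5, 3.9], "gre": [700, 650]})

by_items = df.filter(items=["id", "gpa"])  # exact column names
by_like = df.filter(like="gp")             # names containing "gp"
by_regex = df.filter(regex="^g")           # names matching a regular expression

print(list(by_items.columns))
print(list(by_like.columns))
print(list(by_regex.columns))
```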

| DataFrame Operator | Description |
|---|---|
| index[] | Returns only the filtered rows of DataFrame_instance. Filter uses logical expressions composed of DataFrame columns and Python literals. |
| loc[] | Returns new DataFrame that has only the filtered columns and rows of DataFrame_instance accessed by labels. |
| iloc[] | Returns new DataFrame that has only the filtered columns and rows of DataFrame_instance accessed by integer values. |
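
The difference between the three operators may be easier to see side by side. The sketch below again uses pandas as a local stand-in, on the assumption that the teradataml operators filter in the same way; the data, index labels, and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with explicit index labels.
df = pd.DataFrame(
    {"gpa": [3.5, 3.9, 2.8], "masters": ["yes", "no", "yes"]},
    index=[10, 11, 12],
)

# index[]: keep rows where a logical expression on a column is true.
high = df[df["gpa"] > 3.0]

# loc[]: filter rows and columns by label (or boolean expression).
labeled = df.loc[df["gpa"] > 3.0, ["gpa"]]

# iloc[]: filter rows and columns by integer position.
positional = df.iloc[0:2, 0:1]

print(high)
print(labeled)
print(positional)
```

Each operator returns a new DataFrame; df itself keeps all three rows and both columns.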