You can manipulate a DataFrame with methods and operators. DataFrames created with the DataFrame() constructor or with the DataFrame.from_table() and DataFrame.from_query() functions all have the same methods and operators.
DataFrame Methods
A DataFrame method has the basic syntax DataFrame_instance.method(arguments). Unless noted otherwise, the method uses the specified DataFrame and arguments to return a new DataFrame. The specified DataFrame remains unchanged.
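As a rough illustration of the method syntax and the non-mutating behavior described above, here is a minimal sketch. teradataml needs a live Vantage connection, so the sketch uses pandas as a local stand-in; that substitution, the sample data, and the column names are assumptions and not part of teradataml.

```python
import pandas as pd

# Stand-in data; the column names are hypothetical.
df = pd.DataFrame({"id": [1, 2, 3], "price": [10.0, 20.0, 30.0]})

# Basic syntax: DataFrame_instance.method(arguments).
# assign() returns a new DataFrame that carries the extra column.
with_tax = df.assign(price_with_tax=df["price"] * 1.1)

# The DataFrame the method was called on is left unchanged.
original_columns = list(df.columns)
new_columns = list(with_tax.columns)
print(original_columns)
print(new_columns)
```

With a teradataml DataFrame the call follows the same DataFrame_instance.method(arguments) pattern, though the arguments each method accepts should be checked against that method's own reference page.
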
DataFrame Method | Description |
---|---|
assign() Method | Assigns new column expressions in DataFrame_instance. |
concat() Method | Concatenates two teradataml DataFrame objects along the index axis. |
describe() in Regular Aggregate Mode | Generates statistics for numeric columns. Computes the count, mean, std, min, percentiles, and max for numeric columns. |
drop() Method | Drops specified labels from rows or columns in DataFrame_instance. |
dropna() Method | Removes rows with null values in DataFrame_instance. |
filter() Method | Returns only the filtered columns or rows (based on the index) of DataFrame_instance. The filter is specified as item, like, or regex. Other filters are the operators index[], loc[], and iloc[]. |
get() Method | Retrieves the required columns from a DataFrame, using column names as keys. |
get_values() Method | Retrieves all values (only) present in a teradataml DataFrame. |
groupby() Method | Returns all columns of DataFrame_instance, grouped as specified. |
head() Method | Returns the first n rows of DataFrame_instance. |
join() Method | Joins two different teradataml DataFrames together. |
merge() Method | Merges two teradataml DataFrames together. |
sample() Method | Samples rows from a DataFrame, directly or based on conditions. |
select() Method | Returns only the selected columns of DataFrame_instance. |
set_index() Method | Assigns one or more existing columns as the new index to a teradataml DataFrame. |
sort() Method | Returns all columns of DataFrame_instance, sorted as specified. |
sort_index() Method | Returns objects sorted by labels (along an axis) in either ascending or descending order for a teradataml DataFrame. |
squeeze() Method | Squeezes one-dimensional axis objects into a scalar for teradataml DataFrames with a single element, or into a Series object for a teradataml DataFrame with a single column. |
tail() Method | Returns the last n rows of the sorted teradataml DataFrame. |
DataFrame Method | Description |
---|---|
agg() Method | Applies specified aggregate methods to specified columns of DataFrame_instance. |
count() in Regular Aggregate Mode | Returns column-wise count of DataFrame_instance. |
max() Method | Returns column-wise maximum value of DataFrame_instance. |
mean() Method | Returns column-wise mean value of DataFrame_instance. |
median() in Regular Aggregate Mode | Returns column-wise median value of a DataFrame. |
min() Method | Returns column-wise minimum value of DataFrame_instance. |
std() Method | Returns column-wise standard deviation value of DataFrame_instance. |
sum() Method | Returns column-wise sum value of DataFrame_instance. |
var() Method | Returns column-wise unbiased variance value of the DataFrame. |
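
To make "column-wise" concrete for the aggregates listed above, here is a small sketch. It uses pandas as a local stand-in because teradataml requires a Vantage connection; the data and column names are hypothetical, and the container type and layout of the result in teradataml may differ from what pandas returns.

```python
import pandas as pd

# Stand-in numeric data; the column names are hypothetical.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Column-wise aggregates: one value per column.
col_max = df.max()
col_mean = df.mean()
col_var = df.var()  # pandas uses the unbiased estimator (ddof=1) by default

# agg() applies the named aggregate functions to the named columns.
summary = df.agg({"price": ["min", "max"], "qty": ["sum"]})
print(summary)
```
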
DataFrame Method | Description |
---|---|
bottom() | Returns the smallest number of values in the columns for each group, with or without ties. |
count() in Time Series Aggregate Mode | Returns column-wise count of the DataFrame. |
delta_t() | Calculates time differences, or DELTA_T, between a starting and an ending event. |
describe() in Time Series Aggregate Mode | Generates statistics for numeric columns. |
first() | Returns the oldest value, determined by the timecode, for each group. |
groupby_time() | Resamples time series data, grouping it by time on a datetime column of a DataFrame. |
last() | Returns the newest value, determined by the timecode, for each group. |
mad() | Returns the median of the set of values defined as the absolute value of the difference between each value and the median of all values in each group. |
median() in Time Series Aggregate Mode | Returns column-wise median value of the DataFrame. |
mode() | Returns the column-wise mode of all values in each group. |
resample() | Resamples time series data, grouping it by time on a datetime column of a DataFrame. |
top() | Returns the largest number of values in the columns for each group, with or without ties. |
The describe() and get_values() methods do not return a teradataml DataFrame.
The groupby() method returns an instance of teradataml DataFrameGroupBy, which inherits from teradataml DataFrame.
The groupby_time() and resample() methods return an instance of teradataml DataFrameGroupByTime, which inherits from teradataml DataFrame.
Operators
A DataFrame operator has one of the following basic syntaxes:
- DataFrame_instance.loc[arguments]
- DataFrame_instance.iloc[arguments]
- DataFrame_instance[arguments]
All operators are filters. Another filter is the filter() method.
DataFrame Operator | Description |
---|---|
index[] Operator | Returns only the filtered rows of DataFrame_instance. Filter uses logical expressions composed of DataFrame columns and Python literals. |
loc[] Operator | Returns a new DataFrame that has only the filtered columns and rows of DataFrame_instance, accessed by labels. |
iloc[] Operator | Returns a new DataFrame that has only the filtered columns and rows of DataFrame_instance, accessed by integer positions. |
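
The three operator forms and the filter() method can be sketched as follows. The example runs on pandas as a local stand-in, since teradataml needs a Vantage connection; the data and column names are hypothetical, and the exact argument forms that teradataml accepts for each operator are assumed to be similar and should be confirmed in the operator's own reference page.

```python
import pandas as pd

# Stand-in data; the column names are hypothetical.
df = pd.DataFrame({"id": [1, 2, 3], "price": [10.0, 20.0, 30.0]})

# DataFrame_instance[arguments]: filter rows with a logical expression
# built from a column and a Python literal.
rows = df[df["price"] > 15]

# DataFrame_instance.loc[arguments]: filter rows and columns by label.
by_label = df.loc[df["price"] > 15, ["id"]]

# DataFrame_instance.iloc[arguments]: filter rows and columns by integer position.
by_position = df.iloc[0:2, 0:1]

# filter() method: keep the columns whose name contains the given substring.
price_only = df.filter(like="pri", axis=1)

print(rows)
print(by_label)
print(by_position)
print(price_only)
```

Each expression produces a new object and leaves df as it was, which matches the description of operators as filters.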