Window on DataFrame Column | Teradata Package for Python - Window on DataFrame Column - Teradata Package for Python

Teradata® Package for Python User Guide

Product
Teradata Package for Python
Release Number
17.00
Published
November 2021
Language
English (United States)
Last Update
2022-01-14
dita:mapPath
bol1585763678431.ditamap
dita:ditavalPath
ayr1485454803741.ditaval
dita:id
B700-4006
lifecycle
previous
Product Category
Teradata Vantage

Use the window() function to run window aggregate functions on the teradataml DataFrame columns.

The function allows user to specify the following window types for computations:
  • Cumulative
  • Group
  • Moving
  • Remaining
By default, window with Unbounded Preceding and Unbounded Following is considered for calculation.
If both partition_columns and order_columns are 'None', then window cannot be created on CLOB and BLOB type of columns.

Example Setup

>>> from teradataml import *
>>> load_example_data("dataframe","sales")
>>> df = DataFrame.from_table('sales')

Example 1: Create a window on a teradataml DataFrame Column

>>> window = df.Feb.window()

Example 2: Create a cumulative (expanding) window

This example creates a cumulative (expanding) window with rows between unbounded preceding and 3 preceding, with partition_columns and order_columns arguments, and with default sorting.

>>> window = df.window(partition_columns="Feb",
...                    order_columns=["Feb", "datetime"],
...                    window_start_point=None,
...                    window_end_point=-3)

Example 3: Create a moving (rolling) window

This example creates a moving (rolling) window with rows between current row and 3 following, with sorting done on 'Feb', 'datetime' columns in descending order, and with partition_columns argument.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       window_start_point=0,
                       window_end_point=3)

Example 4: Create a remaining (contracting) window

This example creates a remaining (contracting) window with rows between current row and unbounded following, with sorting done on 'Feb', 'datetime' columns in ascending order, and NULL values in 'Feb', 'datetime' columns appear at last.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       nulls_first=False,
                       window_start_point=0,
                       window_end_point=None)

Example 5: Create a grouping window

This example creates a grouping window, with sorting done on 'Feb', 'datetime' columns in ascending order, and NULL values in 'Feb', 'datetime' columns appear at last.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       nulls_first=False,
                       window_start_point=None,
                       window_end_point=None)

Example 6: Create a window, ignoring all parameters

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       nulls_first=False,
                       ignore_window=True)

Example 7: Perform sum of 'Feb' and attach new column to the DataFrame

>>> window = df.Feb.window()
>>> df.assign(feb_sum=window.sum())
              Feb    Jan    Mar    Apr    datetime  feb_sum
accounts
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017   1000.0
Red Inc     200.0  150.0  140.0    NaN  04/01/2017   1000.0
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017   1000.0
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017   1000.0
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017   1000.0
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017   1000.0

Example 8: Perform min and max operations on column 'Apr' and attach both columns to the DataFrame

>>> window = df.Apr.window()
>>> df.assign(apr_min=window.min(), apr_max=window.max())
              Feb    Jan    Mar    Apr    datetime  apr_max  apr_min
accounts
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017      250      101
Red Inc     200.0  150.0  140.0    NaN  04/01/2017      250      101
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017      250      101
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017      250      101
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017      250      101
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017      250      101

Example 9: Perform count and max operations on column 'accounts' grouped by 'accounts' and attach column to DataFrame

>>> df = df.groupby("accounts")
>>> window = df.accounts.window()
>>> df.assign(accounts_max=window.max(), accounts_count=window.count())
     accounts  accounts_count accounts_max
0   Jones LLC               6   Yellow Inc
1     Red Inc               6   Yellow Inc
2  Yellow Inc               6   Yellow Inc
3  Orange Inc               6   Yellow Inc
4    Blue Inc               6   Yellow Inc
5    Alpha Co               6   Yellow Inc