Window on DataFrame | Teradata Package for Python - 17.00 - Window on DataFrame - Teradata Package for Python

Teradata® Package for Python User Guide

Product
Teradata Package for Python
Release Number
17.00
Published
November 2021
Last Update
2022-01-14
Content Type
User Guide
Publication ID
B700-4006-070K
Language
English (United States)
Last Update
2022-01-14

Use the window() function to run window aggregate functions on the teradataml DataFrame.

The function allows user to specify the following window types for computations:
  • Cumulative
  • Group
  • Moving
  • Remaining
By default, window with Unbounded Preceding and Unbounded Following is considered for calculation.
If both partition_columns and order_columns are 'None' and the original DataFrame has BLOB and CLOB type of columns, then
  • Window aggregate operation on CLOB and BLOB type of columns is omitted;
  • Resultant DataFrame does not contain the BLOB and CLOB type of columns from the original DataFrame.

Example Setup

>>> from teradataml import *
>>> load_example_data("dataframe","sales")
>>> df = DataFrame.from_table('sales')

Example 1: Create a window on a teradataml DataFrame

>>> window = df.window()

Example 2: Create a cumulative (expanding) window

This example creates a cumulative (expanding) window with rows between unbounded preceding and 3 preceding, with partition_columns and order_columns arguments, and with default sorting.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       window_start_point=None,
                       window_end_point=-3)

Example 3: Create a moving (rolling) window

This example creates a moving (rolling) window with rows between current row and 3 following, with sorting done on 'Feb', 'datetime' columns in descending order, and with partition_columns argument.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       window_start_point=0,
                       window_end_point=3)

Example 4: Create a remaining (contracting) window

This example creates a remaining (contracting) window with rows between current row and unbounded following, with sorting done on 'Feb', 'datetime' columns in ascending order, and NULL values in 'Feb', 'datetime' columns appear at last.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       nulls_first=False,
                       window_start_point=0,
                       window_end_point=None)

Example 5: Create a grouping window

This example creates a grouping window, with sorting done on 'Feb', 'datetime' columns in ascending order, and NULL values in 'Feb', 'datetime' columns appear at last.

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       nulls_first=False,
                       window_start_point=None,
                       window_end_point=None)

Example 6: Create a window, ignoring all parameters

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       nulls_first=False,
                       ignore_window=True)

Example 7: Perform sum on every valid column in DataFrame

>>> window = df.window(partition_columns="Feb",
                       order_columns=["Feb", "datetime"],
                       sort_ascending=False,
                       nulls_first=False,
                       ignore_window=True)
>>> window.sum()
              Feb    Jan    Mar    Apr    datetime  Apr_sum  Feb_sum  Jan_sum  Mar_sum
accounts
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017      781   1000.0      550      590
Red Inc     200.0  150.0  140.0    NaN  04/01/2017      781   1000.0      550      590
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017      781   1000.0      550      590
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017      781   1000.0      550      590
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017      781   1000.0      550      590
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017      781   1000.0      550      590

Example 8: Perform count on every valid column in DataFrame

>>> window = df.window()
>>> window.count()
              Feb    Jan    Mar    Apr    datetime  Apr_count  Feb_count  Jan_count  Mar_count  accounts_count  datetime_count
accounts
Jones LLC   200.0  150.0  140.0  180.0  04/01/2017          4          6          4          4               6               6
Red Inc     200.0  150.0  140.0    NaN  04/01/2017          4          6          4          4               6               6
Yellow Inc   90.0    NaN    NaN    NaN  04/01/2017          4          6          4          4               6               6
Orange Inc  210.0    NaN    NaN  250.0  04/01/2017          4          6          4          4               6               6
Blue Inc     90.0   50.0   95.0  101.0  04/01/2017          4          6          4          4               6               6
Alpha Co    210.0  200.0  215.0  250.0  04/01/2017          4          6          4          4               6               6

Example 9: Perform count of all valid columns in DataFrame, which is grouped by 'accounts'

>>> window = df.groupby("accounts").window()
>>> window.count()
      accounts  accounts_count
0   Jones LLC               6
1     Red Inc               6
2  Yellow Inc               6
3  Orange Inc               6
4    Blue Inc               6
5    Alpha Co               6