Teradata Package for Python Function Reference | 17.10 - window - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
Teradata® Package for Python Function Reference
- Product
- Teradata Package for Python
- Release Number
- 17.10
- Published
- April 2022
- Language
- English (United States)
- Last Update
- 2022-08-19
- lifecycle
- previous
- Product Category
- Teradata Vantage
- teradataml.dataframe.sql.DataFrameColumn.window = window(self, partition_columns=None, order_columns=None, sort_ascending=True, nulls_first=None, window_start_point=None, window_end_point=None, ignore_window=False)
- DESCRIPTION:
This function generates Window object on a teradataml DataFrame Column to run
window aggregate functions.
Function allows user to specify window for different types of
computations:
* Cumulative
* Group
* Moving
* Remaining
By default, window with Unbounded Preceding and Unbounded following
is considered for calculation.
Note:
If both "partition_columns" and "order_columns" are None, then
Window cannot be created on CLOB and BLOB type of columns.
PARAMETERS:
partition_columns:
Optional Argument.
Specifies the name(s) of the column(s) over which the ordered
aggregate function executes by partitioning the rows. Such a
grouping is static.
Notes:
1. If this argument is not specified, then the entire data
from teradataml DataFrame, constitutes a single
partition, over which the ordered aggregate function
executes.
2. "partition_columns" does not support CLOB and BLOB type
of columns.
Refer 'DataFrame.tdtypes' to get the types of the
columns of a teradataml DataFrame.
3. "partition_columns" supports only columns specified in
groupby function, if Column is from DataFrameGroupBy.
Types: str OR list of Strings (str)
order_columns:
Optional Argument.
Specifies the name(s) of the column(s) to order the rows in a
partition, which determines the sort order of the rows over
which the function is applied.
Notes:
1. "order_columns" does not support CLOB and BLOB type of
columns.
Refer 'DataFrame.tdtypes' to get the types of the columns
of a teradataml DataFrame.
2. "order_columns" supports only columns specified in
groupby function, if Column is from DataFrameGroupBy.
Types: str OR list of Strings (str)
sort_ascending:
Optional Argument.
Specifies whether column ordering should be in ascending or
descending order.
Default Value: True (ascending)
Note:
When "order_columns" argument is not specified, argument
is ignored.
Types: bool
nulls_first:
Optional Argument.
Specifies whether null results are to be listed first or last
or scattered.
Default Value: None
Note:
When "order_columns" argument is not specified, argument
is ignored.
Types: bool
window_start_point:
Optional Argument.
Specifies a starting point for a window. Based on the integer
value, n, starting point of the window is decided.
* If 'n' is negative, window start point is n rows
preceding the current row/data point.
* If 'n' is positive, window start point is n rows
following the current row/data point.
* If 'n' is 0, window start at current row itself.
* If 'n' is None, window start as Unbounded preceding,
i.e., all rows before current row/data point are
considered.
Notes:
1. Value passed to this should always satisfy following condition:
window_start_point <= window_end_point
2. Following functions does not require any window to
perform window aggregation. So, "window_start_point" is
insignificant for below functions:
* cume_dist
* rank
* dense_rank
* percent_rank
* row_number
* lead
* lag
Default Value: None
Types: int
window_end_point:
Optional Argument.
Specifies an end point for a window. Based on the integer value,
n, starting point of the window is decided.
* If 'n' is negative, window end point is n rows preceding
the current row/data point.
* If 'n' is positive, window end point is n rows following
the current row/data point.
* If 'n' is 0, window end's at current row itself.
* If 'n' is None, window end's at Unbounded following,
i.e., all rows before current row/data point are
considered.
Notes:
1. Value passed to this should always satisfy following condition:
window_start_point <= window_end_point
2. Following functions does not require any window to
perform window aggregation. So, "window_end_point" is
insignificant for below functions:
* cume_dist
* rank
* dense_rank
* percent_rank
* row_number
* lead
* lag
Default Value: None
Types: int
ignore_window:
Optional Argument.
Specifies a flag to ignore parameters related to creating
window ("window_start_point", "window_end_point") and use other
arguments, if specified.
When set to True, window is ignored, i.e., ROWS clause is not
included.
When set to False, window will be created, which is specified
by "window_start_point" and "window_end_point" parameters.
Default Value: False
Types: bool
RAISES:
TypeError, ValueError
RETURNS:
An object of type Window.
EXAMPLES:
# Example 1: Create a window on a teradataml DataFrame column.
>>> load_example_data("dataframe","sales")
>>> df = DataFrame.from_table('sales')
>>> window = df.Feb.window()
>>>
# Example 2: Create a cumulative (expanding) window with rows
# between unbounded preceding and 3 preceding with
# "partition_columns" and "order_columns" argument with
# default sorting.
>>> window = df.Feb.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... window_start_point=None,
... window_end_point=-3)
>>>
# Example 3: Create a moving (rolling) window with rows between
# current row and 3 following with sorting done on 'Feb',
# 'datetime' columns in descending order and
# "partition_columns" argument.
>>> window = df.Feb.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... sort_ascending=False,
... window_start_point=0,
... window_end_point=3)
>>>
# Example 4: Create a remaining (contracting) window with rows
# between current row and unbounded following with
# sorting done on 'Feb', 'datetime' columns in ascending
# order and NULL values in 'Feb', 'datetime'
# columns appears at last.
>>> window = df.Feb.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... nulls_first=False,
... window_start_point=0,
... window_end_point=None
... )
>>>
# Example 5: Create a grouping window, with sorting done on 'Feb',
# 'datetime' columns in ascending order and NULL values
# in 'Feb', 'datetime' columns appears at last.
>>> window = df.Feb.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... sort_ascending=False,
... nulls_first=False,
... window_start_point=None,
... window_end_point=None
... )
>>>
# Example 6: Create a window on a teradataml DataFrame column, which
# ignores all the parameters while creating window.
>>> window = df.Feb.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... sort_ascending=False,
... nulls_first=False,
... ignore_window=True
... )
>>>
# Example 7: Perform sum of Feb and attach new column to the
# DataFrame.
>>> window = df.Feb.window()
>>> df.assign(feb_sum=window.sum())
Feb Jan Mar Apr datetime feb_sum
accounts
Jones LLC 200.0 150.0 140.0 180.0 04/01/2017 1000.0
Red Inc 200.0 150.0 140.0 NaN 04/01/2017 1000.0
Yellow Inc 90.0 NaN NaN NaN 04/01/2017 1000.0
Orange Inc 210.0 NaN NaN 250.0 04/01/2017 1000.0
Blue Inc 90.0 50.0 95.0 101.0 04/01/2017 1000.0
Alpha Co 210.0 200.0 215.0 250.0 04/01/2017 1000.0
>>>
# Example 8: Perform min and max operations on column Apr and
# attach both the columns to the DataFrame.
>>> window = df.Apr.window()
>>> df.assign(apr_min=window.min(), apr_max=window.max())
Feb Jan Mar Apr datetime apr_max apr_min
accounts
Jones LLC 200.0 150.0 140.0 180.0 04/01/2017 250 101
Red Inc 200.0 150.0 140.0 NaN 04/01/2017 250 101
Yellow Inc 90.0 NaN NaN NaN 04/01/2017 250 101
Orange Inc 210.0 NaN NaN 250.0 04/01/2017 250 101
Blue Inc 90.0 50.0 95.0 101.0 04/01/2017 250 101
Alpha Co 210.0 200.0 215.0 250.0 04/01/2017 250 101
>>>
# Example 9: Perform count and max operations on column accounts in
# teradataml DataFrame, which is grouped by 'accounts', and attach
# column to DataFrame.
>>> df = df.groupby("accounts")
>>> window = df.accounts.window()
>>> df.assign(accounts_max=window.max(), accounts_count=window.count())
accounts accounts_count accounts_max
0 Jones LLC 6 Yellow Inc
1 Red Inc 6 Yellow Inc
2 Yellow Inc 6 Yellow Inc
3 Orange Inc 6 Yellow Inc
4 Blue Inc 6 Yellow Inc
5 Alpha Co 6 Yellow Inc
>>>