Teradata® Package for Python Function Reference - 20.00
- Deployment: VantageCloud, VantageCore
- Edition: Enterprise, IntelliFlex, VMware
- Product: Teradata Package for Python
- Release Number: 20.00.00.03
- Published: December 2024
- Product Category: Teradata Vantage
- teradataml.dataframe.dataframe.DataFrame.window = window(self, partition_columns=None, order_columns=None, sort_ascending=True, nulls_first=None, window_start_point=None, window_end_point=None, ignore_window=False)
- DESCRIPTION:
This function generates a Window object on a teradataml DataFrame to run
window aggregate functions.
The function allows the user to specify a window for different types of
computations:
* Cumulative
* Group
* Moving
* Remaining
By default, a window with Unbounded Preceding and Unbounded Following
is considered for the calculation.
Note:
If both "partition_columns" and "order_columns" are None and the
original DataFrame has BLOB and CLOB type columns, then
* Window aggregate operations on CLOB and BLOB type
columns are omitted.
* The resultant DataFrame does not contain the BLOB and CLOB
type columns from the original DataFrame.
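The four computation types above correspond to different choices of the
"window_start_point" and "window_end_point" arguments described under
PARAMETERS. The following is a minimal illustrative sketch, not part of the
original examples; it assumes the 'sales' DataFrame "df" loaded as in
Example 1 below, and the boundary values shown are only one plausible
choice for each type.
# Group window: the default, Unbounded Preceding to Unbounded Following.
>>> group_window = df.window(partition_columns="Feb")
# Cumulative (expanding) window: Unbounded Preceding to the current row.
>>> cumulative_window = df.window(order_columns="datetime",
... window_start_point=None,
... window_end_point=0)
# Moving (rolling) window: 2 rows preceding through the current row.
>>> moving_window = df.window(order_columns="datetime",
... window_start_point=-2,
... window_end_point=0)
# Remaining (contracting) window: the current row to Unbounded Following.
>>> remaining_window = df.window(order_columns="datetime",
... window_start_point=0,
... window_end_point=None)
>>>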
PARAMETERS:
partition_columns:
Optional Argument.
Specifies the name(s) of the column(s) over which the ordered
aggregate function executes by partitioning the rows. Such a
grouping is static.
Notes:
1. If this argument is not specified, then the entire data
from the teradataml DataFrame constitutes a single
partition, over which the ordered aggregate function
executes.
2. "partition_columns" does not support CLOB and BLOB type
columns.
Refer to 'DataFrame.tdtypes' to get the types of the
columns of a teradataml DataFrame.
3. "partition_columns" supports only the columns specified in
the groupby function, if the window is initiated on a DataFrameGroupBy.
Types: str OR list of Strings (str) OR ColumnExpression OR list of ColumnExpressions
order_columns:
Optional Argument.
Specifies the name(s) of the column(s) to order the rows in a
partition, which determines the sort order of the rows over
which the function is applied.
Notes:
1. "order_columns" does not support CLOB and BLOB type
of columns.
Refer 'DataFrame.tdtypes' to get the types of the
columns of a teradataml DataFrame.
2. "order_columns" supports only columns specified in
groupby, if window is initiated on DataFrameGroupBy.
3. When ColumnExpression(s) is(are) passed to "order_columns", then the
corresponding expression takes precedence over arguments
"sort_ascending" and "nulls_first". Say, ColumnExpression is col1, then
1. col1.asc() or col.desc() is effective irrespective of "sort_ascending".
2. col1.nulls_first() or col.nulls_last() is effective irrespective of "nulls_first".
3. Any combination of above two take precedence over "sort_ascending" and "nulls_first".
Types: str OR list of Strings (str) OR ColumnExpression OR list of ColumnExpressions
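The following is an illustrative sketch of Note 3, not part of the original
examples; it assumes the 'sales' DataFrame "df" from Example 1 below. Because
the ColumnExpressions carry their own ordering, the "sort_ascending" and
"nulls_first" values passed here have no effect on those columns.
>>> w = df.window(partition_columns="Feb",
... order_columns=[df.Feb.desc().nulls_last(), df.datetime.asc()],
... sort_ascending=True,
... nulls_first=True)
>>>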
sort_ascending:
Optional Argument.
Specifies whether column ordering should be in ascending or
descending order.
Default Value: True (ascending)
Notes:
* When "order_columns" argument is not specified, this argument
is ignored.
* When ColumnExpression(s) is(are) passed to "order_columns", then the
argument is ignored.
Types: bool
nulls_first:
Optional Argument.
Specifies whether null results are to be listed first or last
or scattered.
Default Value: None
Notes:
* When "order_columns" argument is not specified, this argument
is ignored.
* When "order_columns" is a ColumnExpression(s), this argument
is ignored.
Types: bool
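The following is an illustrative sketch, not part of the original examples,
showing "sort_ascending" and "nulls_first" applied to plain column names
(it assumes the 'sales' DataFrame "df" from Example 1 below): rows in each
partition are ordered in descending order and NULL values are listed last.
>>> w = df.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... sort_ascending=False,
... nulls_first=False)
>>>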
window_start_point:
Optional Argument.
Specifies the starting point for the window. Based on the integer
value, n, the starting point of the window is decided.
* If 'n' is negative, the window start point is n rows
preceding the current row/data point.
* If 'n' is positive, the window start point is n rows
following the current row/data point.
* If 'n' is 0, the window starts at the current row itself.
* If 'n' is None, the window starts at Unbounded Preceding,
i.e., all rows before the current row/data point are
considered.
(An illustrative sketch covering both boundary arguments follows
the "window_end_point" description.)
Notes:
1. The value passed to this argument should always satisfy the
following condition:
window_start_point <= window_end_point
2. The following functions do not require any window to
perform window aggregation, so "window_start_point" is
insignificant for them:
* cume_dist
* rank
* dense_rank
* percent_rank
* row_number
* lead
* lag
Default Value: None
Types: int
window_end_point:
Optional Argument.
Specifies the end point for the window. Based on the integer value,
n, the end point of the window is decided.
* If 'n' is negative, the window end point is n rows preceding
the current row/data point.
* If 'n' is positive, the window end point is n rows following
the current row/data point.
* If 'n' is 0, the window ends at the current row itself.
* If 'n' is None, the window ends at Unbounded Following,
i.e., all rows after the current row/data point are
considered.
Notes:
1. The value passed to this argument should always satisfy the
following condition:
window_start_point <= window_end_point
2. The following functions do not require any window to
perform window aggregation, so "window_end_point" is
insignificant for them:
* cume_dist
* rank
* dense_rank
* percent_rank
* row_number
* lead
* lag
Default Value: None
Types: int
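The following is the illustrative sketch referenced under "window_start_point",
not part of the original examples; it assumes the 'sales' DataFrame "df" from
Example 1 below, and the boundary values are arbitrary. Both boundaries may lie
on the same side of the current row, as long as
window_start_point <= window_end_point.
# Window spanning 5 rows preceding through 1 row preceding the current row.
>>> w1 = df.window(order_columns="datetime",
... window_start_point=-5,
... window_end_point=-1)
# Window spanning 1 row following through 5 rows following the current row.
>>> w2 = df.window(order_columns="datetime",
... window_start_point=1,
... window_end_point=5)
>>>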
ignore_window:
Optional Argument.
Specifies a flag to ignore the window-creation parameters
("window_start_point", "window_end_point") and use the other
arguments, if specified.
When set to True, the window is ignored, i.e., the ROWS clause is
not included.
When set to False, the window is created as specified by the
"window_start_point" and "window_end_point" parameters.
Default Value: False
Types: bool
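The following is an illustrative sketch, not part of the original examples;
it assumes the 'sales' DataFrame "df" from Example 1 below. With
"ignore_window" set to True, the boundary arguments are ignored and no ROWS
clause is generated, as in Example 7 below.
>>> w = df.window(partition_columns="Feb",
... order_columns="datetime",
... window_start_point=-3,
... window_end_point=0,
... ignore_window=True)
>>>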
RAISES:
TypeError, ValueError
RETURNS:
An object of type Window.
EXAMPLES:
# Example 1: Create a window on a teradataml DataFrame.
>>> load_example_data("dataframe","sales")
>>> df = DataFrame.from_table('sales')
>>> window = df.window()
>>>
# Example 2: Create a cumulative (expanding) window with rows
# between unbounded preceding and 3 preceding, using the
# "partition_columns" and "order_columns" arguments with
# default sorting.
>>> window = df.window(partition_columns=df.Feb,
... order_columns=[df.Feb, "datetime"],
... window_start_point=None,
... window_end_point=-3)
>>>
# Example 3: Create a moving (rolling) window with rows between the
# current row and 3 following, with sorting done on the 'Feb'
# column in ascending order and the 'datetime' column in descending
# order, and with the "partition_columns" argument.
>>> window = df.window(partition_columns=df.Feb,
... order_columns=[df.Feb.asc(), df.datetime.desc()],
... window_start_point=0,
... window_end_point=3)
>>>
# Example 4: Create a remaining (contracting) window with rows
# between the current row and unbounded following, with
# sorting done on the 'Feb' and 'datetime' columns in ascending
# order and NULL values in the 'Feb' and 'datetime'
# columns appearing first.
>>> window = df.window(partition_columns=df.Feb,
... order_columns=[df.Feb.nulls_first(), df.datetime.nulls_first()],
... window_start_point=0,
... window_end_point=None)
>>>
# Example 5: Create a grouping window, with sorting done on the 'Feb'
# and 'datetime' columns in ascending order, NULL values in
# the 'Feb' column appearing first, and NULL values in the
# 'datetime' column appearing last.
>>> window = df.window(partition_columns="Feb",
... order_columns=[df.Feb.nulls_first(), df.datetime.nulls_last()],
... window_start_point=None,
... window_end_point=None)
>>>
# Example 6: Create a window on a teradataml DataFrame that ignores
# the window boundary parameters while creating the window.
>>> window = df.window(partition_columns=df.Feb,
... order_columns=[df.Feb.desc().nulls_last(), df.datetime.desc().nulls_last()],
... ignore_window=True)
>>>
# Example 7: Perform sum on every valid column in DataFrame.
>>> window = df.window(partition_columns="Feb",
... order_columns=["Feb", "datetime"],
... sort_ascending=False,
... nulls_first=False,
... ignore_window=True)
>>> window.sum()
Feb Jan Mar Apr datetime Apr_sum Feb_sum Jan_sum Mar_sum
accounts
Jones LLC 200.0 150.0 140.0 180.0 04/01/2017 781 1000.0 550 590
Red Inc 200.0 150.0 140.0 NaN 04/01/2017 781 1000.0 550 590
Yellow Inc 90.0 NaN NaN NaN 04/01/2017 781 1000.0 550 590
Orange Inc 210.0 NaN NaN 250.0 04/01/2017 781 1000.0 550 590
Blue Inc 90.0 50.0 95.0 101.0 04/01/2017 781 1000.0 550 590
Alpha Co 210.0 200.0 215.0 250.0 04/01/2017 781 1000.0 550 590
# Example 8: Perform count on every valid column in DataFrame.
>>> window = df.window()
>>> window.count()
Feb Jan Mar Apr datetime Apr_count Feb_count Jan_count Mar_count accounts_count datetime_count
accounts
Jones LLC 200.0 150.0 140.0 180.0 04/01/2017 4 6 4 4 6 6
Red Inc 200.0 150.0 140.0 NaN 04/01/2017 4 6 4 4 6 6
Yellow Inc 90.0 NaN NaN NaN 04/01/2017 4 6 4 4 6 6
Orange Inc 210.0 NaN NaN 250.0 04/01/2017 4 6 4 4 6 6
Blue Inc 90.0 50.0 95.0 101.0 04/01/2017 4 6 4 4 6 6
Alpha Co 210.0 200.0 215.0 250.0 04/01/2017 4 6 4 4 6 6
>>>
# Example 9: Perform count on all the valid columns in the teradataml
# DataFrame, grouped by 'accounts'.
>>> window = df.groupby("accounts").window()
>>> window.count()
accounts accounts_count
0 Jones LLC 6
1 Red Inc 6
2 Yellow Inc 6
3 Orange Inc 6
4 Blue Inc 6
5 Alpha Co 6
>>>
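# The following is an additional illustrative sketch, not part of the original
# examples: a cumulative (expanding) window over the same 'sales' DataFrame,
# ordered by 'accounts', followed by a window aggregate. The output depends on
# the data and is not reproduced here.
>>> window = df.window(order_columns="accounts",
... window_start_point=None,
... window_end_point=0)
>>> window.sum()
>>>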