Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment: VantageCloud
- Edition: Lake
- Product: Teradata Package for Python
- Release Number: 20.00.00.03
- Published: December 2024
- Last Edition: 2024-12-19
- Product Category: Teradata Vantage
teradataml.dataframe.window.cume_dist = cume_dist()
DESCRIPTION:
Returns the cumulative distribution of values in a teradataml
DataFrame or ColumnExpression over the specified window.
Notes:
1. The window parameter "order_columns" should not be None.
2. Unlike other window aggregate functions, executing cume_dist()
on a window created on a teradataml DataFrame does not create
multiple columns; it adds a single new column to the original
teradataml DataFrame columns. The cumulative distribution is
calculated from the value passed to the "order_columns" parameter
of the window, not from the ColumnExpression.
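The point in Note 2, that the ordering column drives the result, can be illustrated with a minimal pure-Python sketch. It assumes the usual SQL meaning of a cumulative distribution (rows in the partition at or before the current row in the ordering, divided by the partition size) and does not use teradataml or reflect its internals. The helper name cume_dist_sketch and the toy rows are hypothetical.

```python
from collections import defaultdict


def cume_dist_sketch(rows, partition_key, order_key):
    """Return one cumulative-distribution value per input row.

    Assumed definition: (rows in the same partition whose order value
    is <= this row's order value) / (rows in the partition).
    """
    # Collect the ordering values of every partition.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row[order_key])

    result = []
    for row in rows:
        values = partitions[row[partition_key]]
        at_or_before = sum(1 for v in values if v <= row[order_key])
        result.append(at_or_before / len(values))
    return result


# Hypothetical toy rows, not taken from 'admissions_train'.
rows = [
    {"id": 1, "masters": "yes", "gpa": 3.9},
    {"id": 2, "masters": "no", "gpa": 3.1},
    {"id": 3, "masters": "yes", "gpa": 2.5},
    {"id": 4, "masters": "yes", "gpa": 3.0},
    {"id": 5, "masters": "no", "gpa": 3.7},
]

dist = cume_dist_sketch(rows, partition_key="masters", order_key="id")
for row, value in zip(rows, dist):
    print(row["id"], row["masters"], row["gpa"], round(value, 6))
```

In this sketch the 'gpa' values never enter the calculation: only the partition column and the ordering column do, which mirrors the behavior the note describes.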
PARAMETERS:
None.
RETURNS:
* teradataml DataFrame - When the aggregate is executed on a window
created on a teradataml DataFrame.
* ColumnExpression, also known as teradataml DataFrameColumn - When the
aggregate is executed on a window created on a ColumnExpression.
RAISES:
RuntimeError - If the column does not support the aggregate operation.
EXAMPLES:
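# Import the required teradataml objects. A Vantage connection is assumed
# to exist already (for example, one created with create_context()).
>>> from teradataml import DataFrame, load_example_data
>>>
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a teradataml DataFrame on the 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")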
>>> admissions_train
   masters   gpa     stats programming  admitted
id
22     yes  3.46    Novice    Beginner         0
36      no  3.00  Advanced      Novice         0
15     yes  4.00  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
5       no  3.44    Novice      Novice         0
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
19     yes  1.98  Advanced    Advanced         0
>>>
# Example 1: Calculate the cumulative distribution over the values in a
# window partitioned by 'programming' and ordered by 'id'.
# Create a window on 'gpa'.
>>> window = admissions_train.gpa.window(partition_columns=admissions_train.programming,
... order_columns=admissions_train.id)
# Execute cume_dist() on the window and attach the result to the teradataml DataFrame.
# Note: DataFrame.assign() can combine multiple window aggregate operations
# in a single call. This example executes cume_dist() together with the
# max() window aggregate.
>>> df = admissions_train.assign(cume_dist=window.cume_dist(), max_gpa=window.max())
>>> df
   masters   gpa     stats programming  admitted  cume_dist  max_gpa
id
3       no  3.70    Novice    Beginner         1   0.230769      4.0
22     yes  3.46    Novice    Beginner         0   0.384615      4.0
29     yes  4.00    Novice    Beginner         0   0.461538      4.0
31     yes  3.50  Advanced    Beginner         1   0.538462      4.0
34     yes  3.85  Advanced    Beginner         0   0.692308      4.0
35      no  3.68    Novice    Beginner         1   0.769231      4.0
6      yes  3.50  Beginner    Advanced         1   0.062500      4.0
8       no  3.60  Beginner    Advanced         1   0.125000      4.0
9       no  3.82  Advanced    Advanced         1   0.187500      4.0
10      no  3.71  Advanced    Advanced         1   0.250000      4.0
>>>
# Example 2: Calculate the cumulative distribution on a window
# created on a teradataml DataFrame, partitioned by 'masters'
# and ordered by 'id'.
# Create a window on the teradataml DataFrame.
>>> window = admissions_train.window(partition_columns=admissions_train.masters,
... order_columns=admissions_train.id)
>>>
# Execute cume_dist() on the window.
>>> df = window.cume_dist()
>>> df
   masters   gpa     stats programming  admitted  col_cume_dist
id
4      yes  3.50  Beginner      Novice         1       0.136364
7      yes  2.33    Novice      Novice         1       0.227273
14     yes  3.45  Advanced    Advanced         0       0.272727
15     yes  4.00  Advanced    Advanced         1       0.318182
19     yes  1.98  Advanced    Advanced         0       0.409091
20     yes  3.90  Advanced    Advanced         1       0.454545
3       no  3.70    Novice    Beginner         1       0.055556
5       no  3.44    Novice      Novice         0       0.111111
8       no  3.60  Beginner    Advanced         1       0.166667
9       no  3.82  Advanced    Advanced         1       0.222222
>>>
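One way to read the col_cume_dist values in Example 2 is as fractions of the form position / partition size. The sketch below uses only the Python standard library and the decimals displayed above. The bound of 50 rows per partition is an assumption, and the partition sizes it reports are the smallest ones consistent with the displayed values, not figures stated in this reference.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# col_cume_dist values copied from the Example 2 output, grouped by 'masters'.
displayed = {
    "yes": ["0.136364", "0.227273", "0.272727", "0.318182", "0.409091", "0.454545"],
    "no": ["0.055556", "0.111111", "0.166667", "0.222222"],
}

# Assumption: no partition holds more than 50 rows.
MAX_ROWS = 50


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


sizes = {}
ranks = {}
for label, values in displayed.items():
    # Closest fraction to each rounded decimal with a small denominator.
    fractions = [Fraction(text).limit_denominator(MAX_ROWS) for text in values]
    # Smallest partition size that makes every fraction a whole position.
    size = reduce(lcm, (f.denominator for f in fractions))
    sizes[label] = size
    ranks[label] = [int(f * size) for f in fractions]

for label in displayed:
    print(label, "size:", sizes[label], "positions:", ranks[label])
```

The recovered positions increase with 'id' inside each 'masters' partition, which matches a window ordered by 'id'.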