Teradata® Package for Python Function Reference - 20.00
- Deployment
- VantageCloud
- VantageCore
- Edition
- Enterprise
- IntelliFlex
- VMware
- Product
- Teradata Package for Python
- Release Number
- 20.00.00.03
- Published
- December 2024
- ft:lastEdition
- 2024-12-19
- Product Category
- Teradata Vantage
- teradataml.dataframe.window.regr_r2 = regr_r2(expression)
- DESCRIPTION:
Returns the coefficient of determination (R-squared) for all non-null data
pairs of the dependent and independent variable arguments over the
specified window. The function treats the ColumnExpression on which the
window is created (or each valid column, when the window is created on a
DataFrame) as the dependent variable and "expression" as the independent
variable.
Note:
When there are fewer than two non-null data point pairs in the data
used for the computation, the function returns NULL.
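The pair handling described above can be sketched in plain Python. This is a minimal illustration, not the Vantage implementation: it assumes the usual definition of the coefficient of determination for simple linear regression (the squared Pearson correlation), uses None to stand in for NULL, and the function name regr_r2_pairs is hypothetical. Returning None when either variable has zero variance is also an assumption; the source only documents the fewer-than-two-pairs case.

```python
from typing import Optional, Sequence


def regr_r2_pairs(dependent: Sequence[Optional[float]],
                  independent: Sequence[Optional[float]]) -> Optional[float]:
    """Coefficient of determination over the non-null (y, x) pairs."""
    # Drop every pair in which either side is NULL (None here).
    pairs = [(y, x) for y, x in zip(dependent, independent)
             if y is not None and x is not None]
    n = len(pairs)
    # Documented behaviour: fewer than two non-null pairs yields NULL.
    if n < 2:
        return None

    sum_x = sum(x for _, x in pairs)
    sum_y = sum(y for y, _ in pairs)
    # Scaled variance and covariance terms (each is n**2 times the population value).
    s_xx = n * sum(x * x for _, x in pairs) - sum_x ** 2
    s_yy = n * sum(y * y for y, _ in pairs) - sum_y ** 2
    s_xy = n * sum(x * y for y, x in pairs) - sum_x * sum_y

    # Assumption: treat a zero-variance input as undefined and return None.
    if s_xx == 0 or s_yy == 0:
        return None
    return (s_xy ** 2) / (s_xx * s_yy)
```

For instance, regr_r2_pairs([1, None], [2, 3]) leaves a single usable pair and therefore returns None, while a perfectly linear pair of columns returns 1.0.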
PARAMETERS:
expression:
Required Argument.
Specifies the independent variable for the regression, given as a
ColumnExpression, a column name, or a literal.
Types: ColumnExpression OR int OR float OR str
RETURNS:
* teradataml DataFrame - When the aggregate is executed using a window
created on a teradataml DataFrame.
* ColumnExpression, also known as teradataml DataFrameColumn - When the
aggregate is executed using a window created on a ColumnExpression.
RAISES:
RuntimeError - If the column does not support the aggregate operation.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
   masters   gpa     stats programming  admitted
id
22     yes  3.46    Novice    Beginner         0
36      no  3.00  Advanced      Novice         0
15     yes  4.00  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
5       no  3.44    Novice      Novice         0
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
19     yes  1.98  Advanced    Advanced         0
>>>
# Note:
# In the examples here, a ColumnExpression is passed as input. You can
# pass the column name instead of the ColumnExpression.
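The column-name form mentioned in the note can be sketched as follows. The helper name add_rolling_regr_r2 is hypothetical; it only wraps calls that already appear in this section (window(), regr_r2(), count() and assign()) and expects a teradataml DataFrame such as admissions_train to be passed in, so it does not create a Vantage connection itself.

```python
def add_rolling_regr_r2(df, independent="admitted"):
    """Attach a rolling regr_r2() of 'gpa' against a column given by name.

    `df` is expected to be a teradataml DataFrame with 'gpa' and
    'programming' columns (for example, admissions_train).
    """
    # Rolling window on 'gpa': two preceding rows through the current row.
    window = df.gpa.window(partition_columns="programming",
                           window_start_point=-2,
                           window_end_point=0)
    # The independent variable is passed as a column name (str) here,
    # rather than as a ColumnExpression such as df.admitted.
    return df.assign(regr_r2_admitted=window.regr_r2(independent),
                     count_gpa=window.count())
```

Calling add_rolling_regr_r2(admissions_train) is intended to mirror Example 1, with "admitted" standing in for admissions_train.admitted.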
# Example 1: Calculate the coefficient of determination for the values
# forming a pair between 'gpa' and 'admitted', in a Rolling
# window, partitioned over 'programming'.
# Create a Rolling window on 'gpa'.
>>> window = admissions_train.gpa.window(partition_columns="programming",
...                                      window_start_point=-2,
...                                      window_end_point=0)
>>>
# Execute regr_r2() on the Rolling window and attach it to the DataFrame.
# Note: DataFrame.assign() allows combining multiple window aggregate
# operations in one single call. In this example, we are executing
# regr_r2() along with count() window aggregate operations.
>>> df = admissions_train.assign(regr_r2_admitted=window.regr_r2(admissions_train.admitted),
...                              count_gpa=window.count())
>>> df
   masters   gpa     stats programming  admitted  count_gpa  regr_r2_admitted
id
11      no  3.13  Advanced    Advanced         1          3             1.980
27     yes  3.96  Advanced    Advanced         0          3             3.705
26     yes  3.57  Advanced    Advanced         1          3             3.705
6      yes  3.50  Beginner    Advanced         1          3             3.960
9       no  3.82  Advanced    Advanced         1          3               NaN
25      no  3.96  Advanced    Advanced         1          3               NaN
39     yes  3.75  Advanced    Beginner         0          1               NaN
31     yes  3.50  Advanced    Beginner         1          2             3.750
29     yes  4.00    Novice    Beginner         0          3             3.875
21      no  3.87    Novice    Beginner         1          3             4.000
>>>
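The window shapes used in these examples differ only in their start and end points. The sketch below shows which rows of a partition fall into the frame of a given row. It assumes that a negative point counts preceding rows, 0 is the current row, a start point of None means unbounded preceding, and an end point of None means unbounded following. The function name frame_rows is hypothetical and not part of teradataml.

```python
from typing import List, Optional


def frame_rows(n_rows: int, current: int,
               start: Optional[int], end: Optional[int]) -> List[int]:
    """Return the 0-based positions of the rows in the frame of `current`."""
    # Assumption: None as the start point means "unbounded preceding".
    lo = 0 if start is None else max(0, current + start)
    # Assumption: None as the end point means "unbounded following".
    hi = n_rows - 1 if end is None else min(n_rows - 1, current + end)
    return list(range(lo, hi + 1))


# Frame sizes per row for the three window shapes used in the examples.
rolling = [len(frame_rows(4, i, -2, 0)) for i in range(4)]        # [1, 2, 3, 3]
expanding = [len(frame_rows(4, i, None, 0)) for i in range(4)]    # [1, 2, 3, 4]
contracting = [len(frame_rows(7, i, -5, None)) for i in range(7)] # [7, 7, 7, 7, 7, 7, 6]
```

Under these assumptions a Rolling window with start point -2 and end point 0 never holds more than three rows, which is consistent with the count_gpa values of 1, 2 and 3 in Example 1.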
# Example 2: Calculate the coefficient of determination between
# 'admitted' and all valid columns, in an Expanding window,
# partitioned over 'masters', and ordered by 'id' in descending
# order.
# Create an Expanding window on DataFrame.
>>> window = admissions_train.window(partition_columns=admissions_train.masters,
...                                  order_columns=admissions_train.id.desc(),
...                                  window_start_point=None,
...                                  window_end_point=0)
>>>
# Execute regr_r2() on the Expanding window.
>>> df = window.regr_r2(admissions_train.admitted)
>>> df
   masters   gpa     stats programming  admitted  admitted_regr_r2  gpa_regr_r2  id_regr_r2
id
38     yes  2.65  Advanced    Beginner         1               1.0     0.979592    0.750000
32     yes  3.46  Advanced    Beginner         0               1.0     0.878827    0.051907
31     yes  3.50  Advanced    Beginner         1               1.0     0.552687    0.055682
30     yes  3.79  Advanced      Novice         0               1.0     0.574510    0.003541
27     yes  3.96  Advanced    Advanced         0               1.0     0.605686    0.019886
26     yes  3.57  Advanced    Advanced         1               1.0     0.494344    0.016637
37      no  3.52    Novice      Novice         1               NaN          NaN         NaN
36      no  3.00  Advanced      Novice         0               1.0     1.000000    1.000000
35      no  3.68    Novice    Beginner         1               1.0     0.949367    0.000000
33      no  3.55    Novice      Novice         1               1.0     0.946355    0.085714
>>>
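Because an Expanding window always starts at the first row of the partition, its result for each row can be computed from running sums. The sketch below illustrates that idea in plain Python for a single, already ordered partition. It is not how Vantage evaluates the window; the name expanding_regr_r2 is hypothetical, None stands in for NULL, and returning None for zero-variance inputs is an assumption.

```python
from typing import List, Optional, Sequence


def expanding_regr_r2(dependent: Sequence[Optional[float]],
                      independent: Sequence[Optional[float]]) -> List[Optional[float]]:
    """R-squared over rows 0..i of one ordered partition, for every row i."""
    n = sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0
    results: List[Optional[float]] = []
    for y, x in zip(dependent, independent):
        # Only non-null pairs contribute to the running sums.
        if y is not None and x is not None:
            n += 1
            sum_x += x
            sum_y += y
            sum_xx += x * x
            sum_yy += y * y
            sum_xy += x * y
        s_xx = n * sum_xx - sum_x * sum_x
        s_yy = n * sum_yy - sum_y * sum_y
        s_xy = n * sum_xy - sum_x * sum_y
        # Fewer than two pairs yields NULL; zero variance is treated the
        # same way here (assumption).
        if n < 2 or s_xx == 0 or s_yy == 0:
            results.append(None)
        else:
            results.append((s_xy * s_xy) / (s_xx * s_yy))
    return results
```

The first row of a partition always yields None in this sketch, since a single pair is not enough for the computation; this matches the NaN shown for the first row of the 'no' partition (id 37) in the output above.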
# Example 3: Calculate the coefficient of determination between
# 'gpa' and all valid columns, which are grouped by
# 'masters' and 'gpa' in a Contracting window, partitioned
# over 'masters' and ordered by 'masters' with nulls listed last.
# Perform groupby() operation on teradataml DataFrame.
>>> group_by_df = admissions_train.groupby(["masters", "gpa"])
# Create a Contracting window on teradataml DataFrameGroupBy object.
>>> window = group_by_df.window(partition_columns=group_by_df.masters,
...                             order_columns=group_by_df.masters.nulls_last(),
...                             window_start_point=-5,
...                             window_end_point=None)
# Execute regr_r2() on the Contracting window.
>>> window.regr_r2(admissions_train.gpa)
  masters   gpa  gpa_regr_r2
0     yes  3.76          1.0
1     yes  3.81          1.0
2     yes  1.98          1.0
3     yes  3.85          1.0
4     yes  3.75          1.0
5     yes  3.95          1.0
6      no  3.87          1.0
7      no  3.60          1.0
8      no  3.13          1.0
9      no  3.52          1.0
>>>
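Every gpa_regr_r2 value above is 1.0 because 'gpa' is regressed on itself. A short derivation, assuming the usual definition of the coefficient of determination for simple linear regression as the squared Pearson correlation (the page does not spell out the formula):

```latex
R^2 = \frac{S_{xy}^{2}}{S_{xx}\,S_{yy}},
\qquad
S_{xy} = \sum_{i}(x_i-\bar{x})(y_i-\bar{y}),\quad
S_{xx} = \sum_{i}(x_i-\bar{x})^{2},\quad
S_{yy} = \sum_{i}(y_i-\bar{y})^{2}

y_i = x_i \ \text{for all } i
\;\Longrightarrow\;
S_{xy} = S_{xx} = S_{yy}
\;\Longrightarrow\;
R^2 = \frac{S_{xx}^{2}}{S_{xx}\,S_{xx}} = 1
\quad (S_{xx} \neq 0)
```

The condition S_xx != 0 requires at least two distinct values in the window frame.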