Teradata Package for Python Function Reference on VantageCloud Lake - regr_slope - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment
- VantageCloud
- Edition
- Lake
- Product
- Teradata Package for Python
- Release Number
- 20.00.00.03
- Published
- December 2024
- ft:lastEdition
- 2024-12-19
- Product Category
- Teradata Vantage
- teradataml.dataframe.window.regr_slope = regr_slope(expression)
- DESCRIPTION:
Returns the slope of the univariate linear regression line through
all non-null data pairs of the dependent and independent variables
over the specified window. When the function is executed, "expression" is
treated as the independent variable, and the dependent variable is:
* the ColumnExpression, when invoked on a window created on a ColumnExpression.
* every column of the teradataml DataFrame that is valid for this function,
when invoked on a window created on a teradataml DataFrame.
Note:
When there are fewer than two non-null data point pairs in the
data used for the computation, the function returns None.
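The computation described above can be sketched in plain Python. This is a minimal illustration of an ordinary least-squares slope over non-null pairs, not the teradataml implementation. The name regr_slope_sketch and its arguments are hypothetical, and returning None when all independent values are equal is an assumption of the sketch, because this page does not say what the function does in that case.

```python
from typing import Optional, Sequence


def regr_slope_sketch(
    dependent: Sequence[Optional[float]],
    independent: Sequence[Optional[float]],
) -> Optional[float]:
    """Least-squares slope of `dependent` on `independent` (hypothetical helper)."""
    # Keep only the pairs in which both values are non-null.
    pairs = [
        (x, y)
        for y, x in zip(dependent, independent)
        if x is not None and y is not None
    ]
    n = len(pairs)
    # Fewer than two non-null pairs: no slope can be computed.
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        # Assumption of this sketch only: a constant independent variable
        # yields None instead of a division by zero.
        return None
    return sxy / sxx


# The pair containing a null is ignored; the remaining points lie on y = 2x + 1.
print(regr_slope_sketch([1.0, 3.0, 5.0, None], [0.0, 1.0, 2.0, 3.0]))  # 2.0
# Only one non-null pair is left, so no slope is returned.
print(regr_slope_sketch([1.0, None], [0.0, 1.0]))  # None
```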
PARAMETERS:
expression:
Required Argument.
Specifies the independent variable for the regression as a
ColumnExpression, a column name, or a literal.
An independent variable is a treatment: something that is varied under
your control to test the behavior of another variable.
Types: ColumnExpression OR int OR float OR str
RETURNS:
* teradataml DataFrame - When the aggregate is executed using a window created
on a teradataml DataFrame.
* ColumnExpression, also known as teradataml DataFrameColumn - When the aggregate is
executed using a window created on a ColumnExpression.
RAISES:
RuntimeError - If the column does not support the aggregate operation.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
masters gpa stats programming admitted
id
22 yes 3.46 Novice Beginner 0
36 no 3.00 Advanced Novice 0
15 yes 4.00 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
5 no 3.44 Novice Novice 0
17 no 3.83 Advanced Advanced 1
34 yes 3.85 Advanced Beginner 0
13 no 4.00 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
19 yes 1.98 Advanced Advanced 0
>>>
# Note:
# In the examples here, a ColumnExpression is passed as input. You can
# pass the column name instead of the ColumnExpression.
# Example 1: Calculate the slope of the univariate linear regression
# line over all non-null data pairs, with 'admitted' as the
# dependent variable and 'gpa' as the independent variable,
# in a Rolling window, partitioned over 'programming'.
# Create a Rolling window on 'admitted'.
>>> window = admissions_train.admitted.window(partition_columns="programming",
... window_start_point=-2,
... window_end_point=0)
>>>
# Execute regr_slope() on the Rolling window and attach it to the DataFrame.
# Note: DataFrame.assign() allows combining multiple window aggregate
# operations in one single call. In this example, we are executing
# regr_slope() along with count() window aggregate operations.
>>> df = admissions_train.assign(regr_slope_admitted=window.regr_slope(admissions_train.gpa),
... count_gpa=window.count())
>>> df
masters gpa stats programming admitted count_gpa regr_slope_admitted
id
11 no 3.13 Advanced Advanced 1 3 5.730659e-01
27 yes 3.96 Advanced Advanced 0 3 -1.093780e+00
26 yes 3.57 Advanced Advanced 1 3 -6.329114e-01
6 yes 3.50 Beginner Advanced 1 3 -2.306023e+00
9 no 3.82 Advanced Advanced 1 3 -3.314099e-14
25 no 3.96 Advanced Advanced 1 3 -5.393796e-14
39 yes 3.75 Advanced Beginner 0 1 NaN
31 yes 3.50 Advanced Beginner 1 2 -4.000000e+00
29 yes 4.00 Novice Beginner 0 3 -2.000000e+00
21 no 3.87 Novice Beginner 1 3 -1.560178e+00
>>>
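The Rolling window in Example 1 can be traced by hand for the three 'Beginner' rows shown in the output above (ids 39, 31 and 29). The following is a plain-Python sketch, not teradataml code. The helper name slope is hypothetical, and the sketch assumes the rows enter the window in the order displayed, since the example sets no order_columns.

```python
def slope(pairs):
    """Least-squares slope for (x, y) pairs; None with fewer than two pairs."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    # Assumption of this sketch: a constant x yields None.
    return sxy / sxx if sxx else None


# (gpa, admitted) for ids 39, 31 and 29, copied from the output above.
rows = [(3.75, 0), (3.50, 1), (4.00, 0)]
window_start_point, window_end_point = -2, 0  # two preceding rows to current row

rolling = []
for i in range(len(rows)):
    first = max(0, i + window_start_point)  # clip at the start of the partition
    last = i + window_end_point + 1         # slice end is exclusive
    rolling.append(slope(rows[first:last]))

print([None if s is None else round(s, 6) for s in rolling])
# [None, -4.0, -2.0]
```

The first row has a single data pair, so it yields no slope (shown as NaN in the output above); the next two values agree with the -4.000000e+00 and -2.000000e+00 reported for ids 31 and 29.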
# Example 2: Calculate the slope of the univariate linear regression line
# with each valid column as the dependent variable and 'gpa' as the
# independent variable, in an Expanding window, partitioned over
# 'masters' and ordered by 'id' in descending order.
# Create an Expanding window on the DataFrame.
>>> window = admissions_train.window(partition_columns=admissions_train.masters,
... order_columns=admissions_train.id.desc(),
... window_start_point=None,
... window_end_point=0)
>>>
# Execute regr_slope() on the Expanding window.
>>> df = window.regr_slope(admissions_train.gpa)
>>> df
masters gpa stats programming admitted admitted_regr_slope gpa_regr_slope id_regr_slope
id
38 yes 2.65 Advanced Beginner 1 -0.816327 1.0 1.326531
32 yes 3.46 Advanced Beginner 0 -0.797122 1.0 0.193406
31 yes 3.50 Advanced Beginner 1 -0.815774 1.0 0.328116
30 yes 3.79 Advanced Novice 0 -0.838700 1.0 -0.784827
27 yes 3.96 Advanced Advanced 0 -0.809895 1.0 -3.696742
26 yes 3.57 Advanced Advanced 1 -0.848139 1.0 -3.283073
37 no 3.52 Novice Novice 1 NaN NaN NaN
36 no 3.00 Advanced Novice 0 1.923077 1.0 1.923077
35 no 3.68 Novice Beginner 1 1.582278 1.0 -0.632911
33 no 3.55 Novice Novice 1 1.622323 1.0 -1.844813
>>>
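The Expanding window in Example 2 can be checked the same way for the masters = 'no' partition shown above (ids 37, 36, 35 and 33, in descending order). This is a plain-Python sketch, not teradataml code. The helper names slope and expanding are hypothetical, and the sketch assumes id 37 is the first row of that partition, which is consistent with its NaN result.

```python
def slope(pairs):
    """Least-squares slope for (x, y) pairs; None with fewer than two pairs."""
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    # Assumption of this sketch: a constant x yields None.
    return sxy / sxx if sxx else None


# (id, gpa, admitted) rows copied from the output above, ordered by id descending.
rows = [(37, 3.52, 1), (36, 3.00, 0), (35, 3.68, 1), (33, 3.55, 1)]


def expanding(column):
    """Slope of rows[...][column] on gpa from the partition start to each row."""
    results = []
    for i in range(len(rows)):
        pairs = [(r[1], r[column]) for r in rows[: i + 1]]
        s = slope(pairs)
        results.append(None if s is None else round(s, 6))
    return results


print(expanding(2))  # admitted on gpa: [None, 1.923077, 1.582278, 1.622323]
print(expanding(0))  # id on gpa:       [None, 1.923077, -0.632911, -1.844813]
print(expanding(1))  # gpa on gpa:      [None, 1.0, 1.0, 1.0]
```

Regressing 'gpa' on itself always gives a slope of 1.0, which is why the gpa_regr_slope column in the output above is constant.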
# Example 3: Calculate the slope of the univariate linear regression line
# with 'gpa' as the independent variable and each valid column as the
# dependent variable, over all non-null data pairs, on data grouped
# by 'masters', 'gpa' and 'admitted', in a Contracting window
# partitioned over 'masters' and ordered by 'masters' with nulls listed last.
# Perform groupby() operation on teradataml DataFrame.
>>> group_by_df = admissions_train.groupby(["masters", "gpa", "admitted"])
# Create a Contracting window on teradataml DataFrameGroupBy object.
>>> window = group_by_df.window(partition_columns=group_by_df.masters,
... order_columns=group_by_df.masters.nulls_last(),
... window_start_point=-5,
... window_end_point=None)
# Execute regr_slope() on the Contracting window.
>>> window.regr_slope(admissions_train.gpa)
masters gpa admitted admitted_regr_slope gpa_regr_slope
0 yes 3.75 0 -0.432627 1.0
1 yes 2.65 1 0.125413 1.0
2 yes 3.50 1 0.056046 1.0
3 yes 3.76 0 0.095627 1.0
4 yes 3.90 1 0.039339 1.0
5 yes 1.98 0 -0.004912 1.0
6 no 3.44 0 0.239624 1.0
7 no 3.60 1 0.286300 1.0
8 no 3.52 1 0.317647 1.0
9 no 3.55 1 0.809892 1.0
>>>