Teradata Package for Python Function Reference - regr_intercept

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.00

Published

November 2021

Language

English (United States)

Last Update

2021-11-19

Product Category

Teradata Vantage

teradataml.dataframe.window.regr_intercept = regr_intercept(expression)

DESCRIPTION:
    Function returns the intercept of the univariate linear regression line
    through all non-null data pairs of the dependent and independent variable
    arguments over the specified window. The intercept is the point at which
    the regression line through the non-null data pairs in the sample
    intersects the ordinate, or y-axis, of the graph.
    The plot of the linear regression on the variables is used to predict
    the behavior of the dependent variable from the change in the independent
    variable.
    Note that the independent and dependent variables can have a strong
    nonlinear relationship, and a simple linear regression computed on such
    variable pairs does not reflect that relationship.
    The function considers ColumnExpression as a dependent variable and
    "expression" as an independent variable.

PARAMETERS:
    expression:
        Required Argument.
        Specifies a ColumnExpression of a column or name of the column or a
        literal representing an independent variable for the regression.
        Types: ColumnExpression OR int OR float OR str

RETURNS:
    * teradataml DataFrame - When aggregate is executed using window created
      on teradataml DataFrame.
    * ColumnExpression, also known as teradataml DataFrameColumn - When
      aggregate is executed using window created on ColumnExpression.

RAISES:
    RuntimeError - If column does not support the aggregate operation.

EXAMPLES:
    # Load the data to run the example.
    >>> load_example_data("dataframe", "admissions_train")

    # Create a DataFrame on 'admissions_train' table.
    >>> admissions_train = DataFrame("admissions_train")
    >>> admissions_train
       masters   gpa     stats programming  admitted
    id
    22     yes  3.46    Novice    Beginner         0
    36      no  3.00  Advanced      Novice         0
    15     yes  4.00  Advanced    Advanced         1
    38     yes  2.65  Advanced    Beginner         1
    5       no  3.44    Novice      Novice         0
    17      no  3.83  Advanced    Advanced         1
    34     yes  3.85  Advanced    Beginner         0
    13      no  4.00  Advanced      Novice         1
    26     yes  3.57  Advanced    Advanced         1
    19     yes  1.98  Advanced    Advanced         0

    # Note:
    #     In the examples here, ColumnExpression is passed as input. User can
    #     choose to pass column name instead of the ColumnExpression.

    # Example 1: Calculate the intercept of the univariate linear regression
    #            line through all non-null data pairs of 'gpa' (dependent
    #            variable) and 'admitted' (independent variable), in a
    #            Rolling window, partitioned over 'programming'.
    # Create a Rolling window on 'gpa'.
    >>> window = admissions_train.gpa.window(partition_columns="programming",
    ...                                      window_start_point=-2,
    ...                                      window_end_point=0)

    # Execute regr_intercept() on the Rolling window and attach it to the
    # DataFrame.
    # Note: DataFrame.assign() allows combining multiple window aggregate
    #       operations in one single call. In this example, we are executing
    #       regr_intercept() along with count() window aggregate operations.
    >>> df = admissions_train.assign(regr_intercept_gpa=window.regr_intercept(admissions_train.admitted),
    ...                              count_gpa=window.count())
    >>> df
       masters   gpa     stats programming  admitted  count_gpa  regr_intercept_gpa
    id
    15     yes  4.00  Advanced    Advanced         1          3                 NaN
    16      no  3.70  Advanced    Advanced         1          3                 NaN
    11      no  3.13  Advanced    Advanced         1          3                 NaN
    9       no  3.82  Advanced    Advanced         1          3                 NaN
    19     yes  1.98  Advanced    Advanced         0          3                1.98
    27     yes  3.96  Advanced    Advanced         0          3                2.97
    1      yes  3.95  Beginner    Beginner         0          1                 NaN
    34     yes  3.85  Advanced    Beginner         0          2                 NaN
    32     yes  3.46  Advanced    Beginner         0          3                 NaN
    40     yes  3.95    Novice    Beginner         0          3                 NaN

    # Example 2: Calculate the intercept of the univariate linear regression
    #            line through all non-null data pairs of each applicable
    #            column (dependent variable) and 'admitted' (independent
    #            variable), in an Expanding window, partitioned over
    #            'masters' and ordered by 'id' in descending order.
    # Create an Expanding window on DataFrame.
    >>> window = admissions_train.window(partition_columns="masters",
    ...                                  order_columns="id",
    ...                                  sort_ascending=False,
    ...                                  window_start_point=None,
    ...                                  window_end_point=0)

    # Execute regr_intercept() on the Expanding window.
    >>> df = window.regr_intercept(admissions_train.admitted)
    >>> df
       masters   gpa     stats programming  admitted  admitted_regr_intercept  gpa_regr_intercept  id_regr_intercept
    id
    38     yes  2.65  Advanced    Beginner         1                      0.0            3.850000              39.50
    32     yes  3.46  Advanced    Beginner         0                      0.0            3.752500              36.25
    31     yes  3.50  Advanced    Beginner         1                      0.0            3.752500              36.25
    30     yes  3.79  Advanced      Novice         0                      0.0            3.760000              35.00
    27     yes  3.96  Advanced    Advanced         0                      0.0            3.822857              33.00
    26     yes  3.57  Advanced    Advanced         1                      0.0            3.822857              33.00
    37      no  3.52    Novice      Novice         1                      NaN                 NaN                NaN
    36      no  3.00  Advanced      Novice         0                      0.0            3.000000              36.00
    35      no  3.68    Novice    Beginner         1                      0.0            3.000000              36.00
    33      no  3.55    Novice      Novice         1                      0.0            3.000000              36.00

    # Example 3: Calculate the intercept of the univariate linear regression
    #            line through all non-null data pairs of each applicable
    #            column (dependent variable) and 'gpa' (independent
    #            variable), on data grouped by 'masters', 'gpa' and 'id',
    #            in a Contracting window, partitioned over 'masters' and
    #            ordered by 'masters' with nulls listed last.
    # Perform group_by() operation on teradataml DataFrame.
    >>> group_by_df = admissions_train.groupby(["masters", "gpa", "id"])

    # Create a Contracting window on teradataml DataFrameGroupBy object.
    >>> window = group_by_df.window(partition_columns="masters",
    ...                             order_columns="masters",
    ...                             nulls_first=False,
    ...                             window_start_point=-5,
    ...                             window_end_point=None)

    # Execute regr_intercept() on the Contracting window.
    >>> window.regr_intercept(admissions_train.gpa)
      masters   gpa  id  gpa_regr_intercept  id_regr_intercept
    0     yes  3.50   4                 0.0           6.109916
    1     yes  3.79  30                 0.0           5.004477
    2     yes  3.45  14                 0.0          22.411673
    3     yes  3.50  31                 0.0          30.417318
    4     yes  3.50   6                 0.0          30.390058
    5     yes  3.59  23                 0.0          32.804411
    6      no  3.65  12                 0.0          24.834395
    7      no  3.87  21                 0.0          24.222350
    8      no  3.44   5                 0.0          24.850737
    9      no  1.87  24                 0.0          22.248982
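
To make the quantity described under DESCRIPTION concrete, the following is a minimal pure-Python sketch of a least-squares intercept over a rolling window. It is an illustration only, not the teradataml or Vantage implementation: the helper names `regr_intercept_pairs` and `rolling_regr_intercept` are hypothetical, the sketch uses the textbook formula intercept = mean(y) - slope * mean(x), and it assumes the result is undefined (NaN) when the independent variable has no variation inside the window. The sample values are 'gpa' and 'admitted' from four rows shown in Example 1, under the further assumption that the displayed row order matches the order in which the window saw them.

```python
import math


def regr_intercept_pairs(dependent, independent):
    """Least-squares intercept through the non-null (y, x) pairs.

    Hypothetical helper for illustration; not part of teradataml.
    """
    # Keep only pairs where both values are present.
    pairs = [(y, x) for y, x in zip(dependent, independent)
             if y is not None and x is not None]
    n = len(pairs)
    if n == 0:
        return math.nan
    mean_y = sum(y for y, _ in pairs) / n
    mean_x = sum(x for _, x in pairs) / n
    # Sum of squared deviations of x; zero means x is constant in the window.
    sxx = sum((x - mean_x) ** 2 for _, x in pairs)
    if sxx == 0:
        # Assumption: no unique regression line exists, so report NaN.
        return math.nan
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x


def rolling_regr_intercept(dependent, independent, start=-2, end=0):
    """Apply regr_intercept_pairs over a window of rows [i+start, i+end]."""
    size = len(dependent)
    result = []
    for i in range(size):
        lo = max(0, i + start)
        hi = min(size, i + end + 1)
        result.append(regr_intercept_pairs(dependent[lo:hi],
                                           independent[lo:hi]))
    return result


# 'gpa' (dependent) and 'admitted' (independent) for ids 11, 9, 19, 27.
gpa = [3.13, 3.82, 1.98, 3.96]
admitted = [1, 1, 0, 0]
print([round(v, 2) for v in rolling_regr_intercept(gpa, admitted)])
# prints [nan, nan, 1.98, 2.97]
```

With a 0/1 independent variable, the intercept is the mean of the dependent variable over the rows where the independent variable is 0, which is why the last two windows give 1.98 and 2.97. The first two windows contain only rows with 'admitted' equal to 1, so the sketch returns NaN there.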