Teradata Package for Python Function Reference - regr_sxx

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2021-11-19
Product Category: Teradata Vantage

teradataml.dataframe.window.regr_sxx = regr_sxx(expression)
    DESCRIPTION:
        Function returns the sum of the squares of the independent variable
        expression for all non-null data pairs of the dependent and independent
        variable arguments over the specified window.
        When the function is executed, "expression" is treated as the independent
        variable and the dependent variable is:
            * a ColumnExpression, when invoked using a window created on a
              ColumnExpression.
            * all columns of the teradataml DataFrame which are valid for this
              function, when executed on a window created on a teradataml
              DataFrame.
        Note:
            When there are fewer than two non-null data point pairs in the
            data used for the computation, the function returns None.

    PARAMETERS:
        expression:
            Required Argument.
            Specifies a ColumnExpression of a column, the name of a column,
            or a literal representing an independent variable for the
            regression. An independent variable is a treatment: something
            that is varied under your control to test the behavior of
            another variable.
            Types: ColumnExpression OR int OR float OR str

    RETURNS:
        * teradataml DataFrame - When the aggregate is executed using a window
          created on a teradataml DataFrame.
        * ColumnExpression, also known as teradataml DataFrameColumn - When
          the aggregate is executed using a window created on a
          ColumnExpression.

    RAISES:
        RuntimeError - If the column does not support the aggregate operation.

    EXAMPLES:
        # Load the data to run the example.
        >>> load_example_data("dataframe", "admissions_train")
        >>>
        # Create a DataFrame on 'admissions_train' table.
        >>> admissions_train = DataFrame("admissions_train")
        >>> admissions_train
            masters   gpa     stats  programming  admitted
        id
        22      yes  3.46    Novice     Beginner         0
        36       no  3.00  Advanced       Novice         0
        15      yes  4.00  Advanced     Advanced         1
        38      yes  2.65  Advanced     Beginner         1
        5        no  3.44    Novice       Novice         0
        17       no  3.83  Advanced     Advanced         1
        34      yes  3.85  Advanced     Beginner         0
        13       no  4.00  Advanced       Novice         1
        26      yes  3.57  Advanced     Advanced         1
        19      yes  1.98  Advanced     Advanced         0
        >>>
        # Note:
        #     In the examples here, a ColumnExpression is passed as input. The user
        #     can choose to pass the column name instead of the ColumnExpression.

        # Example 1: Calculate the sum of the squares of column 'gpa' for all
        #            non-null data pairs with dependent variable as 'admitted',
        #            in a Rolling window, partitioned over 'programming'.
        # Create a Rolling window on 'admitted'.
        >>> window = admissions_train.admitted.window(partition_columns="programming",
        ...                                           window_start_point=-2,
        ...                                           window_end_point=0)
        >>>
        # Execute regr_sxx() on the Rolling window and attach it to the DataFrame.
        # Note: DataFrame.assign() allows combining multiple window aggregate
        #       operations in one single call. In this example, we are executing
        #       regr_sxx() along with count() window aggregate operations.
        >>> df = admissions_train.assign(regr_sxx_admitted_gpa=window.regr_sxx(admissions_train.gpa),
        ...                              count_gpa=window.count())
        >>> df
            masters   gpa     stats  programming  admitted  count_gpa  regr_sxx_admitted_gpa
        id
        15      yes  4.00  Advanced     Advanced         1          3           8.006667e-02
        16       no  3.70  Advanced     Advanced         1          3           5.306667e-02
        11       no  3.13  Advanced     Advanced         1          3           3.604667e-01
        9        no  3.82  Advanced     Advanced         1          3           2.718000e-01
        19      yes  1.98  Advanced     Advanced         0          3           1.932800e+00
        27      yes  3.96  Advanced     Advanced         0          3           2.147467e+00
        1       yes  3.95  Beginner     Beginner         0          1          -4.796510e-16
        34      yes  3.85  Advanced     Beginner         0          2           5.000000e-03
        32      yes  3.46  Advanced     Beginner         0          3           1.340667e-01
        40      yes  3.95    Novice     Beginner         0          3           1.340667e-01
        >>>

        # Example 2: Calculate the sum of the squares for all columns as
        #            dependent variable and 'gpa' as independent variable,
        #            in an Expanding window, partitioned over 'masters',
        #            and order by 'id' in descending order.
        # Create an Expanding window on DataFrame.
        >>> window = admissions_train.window(partition_columns="masters",
        ...                                  order_columns="id",
        ...                                  sort_ascending=False,
        ...                                  window_start_point=None,
        ...                                  window_end_point=0)
        >>>
        # Execute regr_sxx() on the Expanding window.
        >>> df = window.regr_sxx(admissions_train.gpa)
        >>> df
            masters   gpa     stats  programming  admitted  admitted_regr_sxx   gpa_regr_sxx    id_regr_sxx
        id
        38      yes  2.65  Advanced     Beginner         1       9.800000e-01   9.800000e-01   9.800000e-01
        32      yes  3.46  Advanced     Beginner         0       1.106480e+00   1.106480e+00   1.106480e+00
        31      yes  3.50  Advanced     Beginner         1       1.107333e+00   1.107333e+00   1.107333e+00
        30      yes  3.79  Advanced       Novice         0       1.166771e+00   1.166771e+00   1.166771e+00
        27      yes  3.96  Advanced     Advanced         0       1.436400e+00   1.436400e+00   1.436400e+00
        26      yes  3.57  Advanced     Advanced         1       1.443160e+00   1.443160e+00   1.443160e+00
        37       no  3.52    Novice       Novice         1      -4.891920e-16  -4.891920e-16  -4.891920e-16
        36       no  3.00  Advanced       Novice         0       1.352000e-01   1.352000e-01   1.352000e-01
        35       no  3.68    Novice     Beginner         1       2.528000e-01   2.528000e-01   2.528000e-01
        33       no  3.55    Novice       Novice         1       2.696750e-01   2.696750e-01   2.696750e-01
        >>>

        # Example 3: Calculate the sum of the squares for all columns with independent
        #            variable as 'gpa', which are grouped by 'masters' and 'gpa' in a
        #            Contracting window, partitioned over 'masters' and order by 'masters'
        #            with nulls listed last.
        # Perform group_by() operation on teradataml DataFrame.
        >>> group_by_df = admissions_train.groupby(["masters", "gpa"])
        # Create a Contracting window on teradataml DataFrameGroupBy object.
        >>> window = group_by_df.window(partition_columns="masters",
        ...                             order_columns="masters",
        ...                             nulls_first=False,
        ...                             window_start_point=-5,
        ...                             window_end_point=None)
        # Execute regr_sxx() on the Contracting window.
        >>> window.regr_sxx(admissions_train.gpa)
           masters   gpa  gpa_regr_sxx
        0       no  3.71      0.630000
        1       no  3.52      0.733410
        2       no  3.68      0.734400
        3       no  3.83      0.796367
        4       no  3.55      1.133143
        5       no  3.96      1.133333
        6      yes  3.59      1.743133
        7      yes  3.95      3.680286
        8      yes  3.46      3.976088
        9      yes  3.76      3.986600
        >>>
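To make the numbers in Example 1 easier to check by hand, the following is a minimal plain-Python sketch of what regr_sxx() appears to compute: the centered sum of squares of the independent values, SUM(x**2) - SUM(x)**2 / n, over the non-null pairs in each window. This reading is inferred from the sample output (for instance, gpa values 3.95 and 3.85 give 5.000000e-03) and is not stated in the text above. The helper names `regr_sxx` and `rolling_regr_sxx` below are hypothetical, do not use teradataml, and do not model how the database handles very small windows.

```python
from typing import List, Optional, Sequence, Tuple

# A (dependent, independent) pair; either side may be None (SQL NULL).
Pair = Tuple[Optional[float], Optional[float]]


def regr_sxx(pairs: Sequence[Pair]) -> Optional[float]:
    """Centered sum of squares of the independent values over non-null pairs."""
    # Keep the independent value only when both members of the pair are present.
    xs = [float(x) for y, x in pairs if y is not None and x is not None]
    if not xs:
        return None  # assumption of this sketch: nothing left to aggregate
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def rolling_regr_sxx(pairs: Sequence[Pair], preceding: int = 2) -> List[Optional[float]]:
    """Apply regr_sxx over `preceding` earlier rows plus the current row."""
    return [regr_sxx(pairs[max(0, i - preceding): i + 1]) for i in range(len(pairs))]


# ('admitted', 'gpa') pairs for the 'Beginner' rows shown in Example 1,
# taken in the order they are displayed (the row order is an assumption).
beginner = [(0, 3.95), (0, 3.85), (0, 3.46), (0, 3.95)]
result = rolling_regr_sxx(beginner)
print(["%.6e" % value for value in result])
```

Under these assumptions the sketch gives approximately 0.0, 0.005, 0.134067 and 0.134067, close to the regr_sxx_admitted_gpa values shown for ids 1, 34, 32 and 40 in Example 1.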