| |
- regr_intercept(dependent_variable_expression, independent_variable_expression)
- DESCRIPTION:
Returns the intercept of the univariate linear regression line through
all non-null data pairs of the dependent and independent variable arguments.
The intercept is the point at which that regression line crosses the ordinate,
or y-axis, of the graph; that is, the predicted value of the dependent variable
when the independent variable is 0.
The fitted regression line is used to predict how the dependent variable
responds to a change in the independent variable.
The independent and dependent variables can have a strong nonlinear
relationship, and a simple linear regression computed on such variable
pairs does not reflect that relationship.
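As a rough illustration of the description above, a least-squares intercept can be computed locally as mean(y) - slope * mean(x). The sketch below is plain Python, not the database implementation: the helper name regr_intercept_local, the sample pairs, and the choice to return None for empty or constant input are all assumptions made for illustration.

```python
from statistics import fmean


def regr_intercept_local(pairs):
    """Least-squares intercept for (dependent, independent) pairs.

    Hypothetical local helper, not part of any library. Pairs in which
    either value is None are skipped, mirroring the "non-null data pairs"
    rule in the description.
    """
    clean = [(y, x) for y, x in pairs if y is not None and x is not None]
    if not clean:
        return None  # assumption: no usable pairs yields no result
    mean_y = fmean(y for y, _ in clean)
    mean_x = fmean(x for _, x in clean)
    sxx = sum((x - mean_x) ** 2 for _, x in clean)
    if sxx == 0:
        return None  # assumption: constant x means the slope is undefined
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in clean)
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Points on y = 2x + 1, plus two pairs containing a null that are ignored.
linear = [(1.0, 0.0), (3.0, 1.0), (5.0, 2.0), (None, 3.0), (9.0, None)]
linear_intercept = regr_intercept_local(linear)

# y = x**2 on symmetric x values: a strong nonlinear relationship, yet the
# fitted slope is 0 and the intercept is simply the mean of y.
curved = [(x * x, x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
curved_intercept = regr_intercept_local(curved)

print(linear_intercept, curved_intercept)
```

The second data set shows the caveat in practice: the fitted line is flat even though the dependent variable is fully determined by the independent one.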
PARAMETERS:
dependent_variable_expression:
Required Argument.
Specifies a ColumnExpression of a column or a literal representing a
dependent variable for the regression.
A dependent variable is something that is measured in response to a treatment.
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
independent_variable_expression:
Required Argument.
Specifies a ColumnExpression of a column or a literal representing an
independent variable for the regression.
An independent variable is a treatment: something that is varied under
your control to test the behavior of another variable.
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
NOTE:
Function accepts positional arguments only.
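Because the arguments are positional, their order alone decides which variable is treated as dependent. The plain-Python sketch below, with a hypothetical helper named intercept and made-up sample values, suggests why swapping them generally gives a different result; it illustrates the arithmetic only and does not call the database function.

```python
def intercept(dependent, independent):
    """Least-squares intercept of `dependent` regressed on `independent`.

    Hypothetical helper for illustration; assumes equal-length lists with
    no nulls and a non-constant independent variable.
    """
    n = len(dependent)
    mean_y = sum(dependent) / n
    mean_x = sum(independent) / n
    sxx = sum((x - mean_x) ** 2 for x in independent)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(independent, dependent))
    return mean_y - (sxy / sxx) * mean_x


# Made-up sample values.
y = [1.0, 2.0, 2.0, 5.0]
x = [0.0, 1.0, 2.0, 3.0]

y_on_x = intercept(y, x)  # y is the dependent variable
x_on_y = intercept(x, y)  # arguments swapped: x is the dependent variable

print(y_on_x, x_on_y)
```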
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
   masters   gpa     stats  programming  admitted
id
22     yes  3.46    Novice     Beginner         0
36      no  3.00  Advanced       Novice         0
15     yes  4.00  Advanced     Advanced         1
38     yes  2.65  Advanced     Beginner         1
5       no  3.44    Novice       Novice         0
17      no  3.83  Advanced     Advanced         1
34     yes  3.85  Advanced     Beginner         0
13      no  4.00  Advanced       Novice         1
26     yes  3.57  Advanced     Advanced         1
19     yes  1.98  Advanced     Advanced         0
>>>
# Example 1: Calculate the intercept of the regression of the "admitted" column
# (dependent variable) on the "gpa" column (independent variable).
# Import func from sqlalchemy to execute the regr_intercept function.
>>> from sqlalchemy import func
# Create a sqlalchemy Function object.
>>> regr_intercept_func_ = func.regr_intercept(admissions_train.admitted.expression, admissions_train.gpa.expression)
>>>
# Pass the Function object as input to DataFrame.assign().
# drop_columns=True returns only the newly assigned column.
>>> df = admissions_train.assign(drop_columns=True, regr_intercept_=regr_intercept_func_)
>>> print(df)
   regr_intercept_
0         0.724144
>>>
# Example 2: Calculate the intercept of the regression of the "admitted" column
# (dependent variable) on the "gpa" column (independent variable)
# for each level of programming.
# Note:
# When assign() is run after DataFrame.groupby(), the function ignores
# the "drop_columns" argument.
>>> admissions_train.groupby("programming").assign(regr_intercept_=regr_intercept_func_)
  programming  regr_intercept_
0    Beginner         2.566361
1    Advanced        -0.626557
2      Novice         1.000091
>>>
|