| |
- var_samp(value_expression)
- DESCRIPTION:
Returns the sample variance of the data points in value_expression.
The variance of a sample is a measure of dispersion from the mean of
that sample. It is the square of the sample standard deviation.
The computation is more conservative than that for the population
variance: the sum of squared deviations is divided by n - 1 rather
than n, which yields a slightly larger value and corrects for the mean
being estimated from the same sample.
When the sample used for the computation has fewer than two non-null
data points, the function returns NULL.
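The definition above can be sketched in plain Python. This is only an illustration of the arithmetic, not the database implementation: the helper name sample_variance is hypothetical, and None stands in for NULL.

```python
import statistics


def sample_variance(values):
    """Return the sample variance of the non-None values.

    Hypothetical helper for illustration only. Returns None when fewer
    than two non-None values remain, mirroring the NULL result
    described above.
    """
    data = [v for v in values if v is not None]
    n = len(data)
    if n < 2:
        return None
    mean = sum(data) / n
    # Divide by n - 1 (not n): this is what separates the sample
    # variance from the population variance.
    return sum((x - mean) ** 2 for x in data) / (n - 1)


gpas = [3.46, 3.00, None, 4.00, 2.65]
non_null = [g for g in gpas if g is not None]

result = sample_variance(gpas)
print(result)

# Cross-checks against the standard library: the sample variance agrees
# with statistics.variance and equals the sample standard deviation squared.
assert abs(result - statistics.variance(non_null)) < 1e-12
assert abs(result - statistics.stdev(non_null) ** 2) < 1e-12

# A single non-null data point is not enough to compute a sample variance.
assert sample_variance([3.46, None]) is None
```

The None filter is applied before counting, so null entries neither contribute to the mean nor count toward the two-point minimum.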
PARAMETERS:
value_expression:
Required Argument.
Specifies a ColumnExpression of a numeric column for which sample variance
is to be computed.
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
NOTE:
Function accepts positional arguments only.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
   masters   gpa     stats programming  admitted
id
22     yes  3.46    Novice    Beginner         0
36      no  3.00  Advanced      Novice         0
15     yes  4.00  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1
5       no  3.44    Novice      Novice         0
17      no  3.83  Advanced    Advanced         1
34     yes  3.85  Advanced    Beginner         0
13      no  4.00  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
19     yes  1.98  Advanced    Advanced         0
>>>
# Example 1: Calculate the sample variance for the values in "gpa" column.
# Import func from sqlalchemy to execute var_samp function.
>>> from sqlalchemy import func
# Create a sqlalchemy Function object.
>>> var_samp_func_ = func.var_samp(admissions_train.gpa.expression)
>>>
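# Pass the Function object as input to DataFrame.assign().
# The first positional argument (True) is "drop_columns", so only the
# newly assigned column appears in the result.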
>>> df = admissions_train.assign(True, var_samp_gpa_=var_samp_func_)
>>> print(df)
   var_samp_gpa_
0       0.263953
>>>
# Example 2: Calculate the sample variance for the values in "gpa" column
# for each level of programming.
# Note:
# When assign() is run after DataFrame.groupby(), the function ignores
# the "drop_columns" argument.
>>> admissions_train.groupby("programming").assign(var_samp_gpa_=func.var_samp(admissions_train.gpa.expression))
  programming  var_samp_gpa_
0    Advanced       0.244026
1      Novice       0.418267
2    Beginner       0.125817
>>>
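The grouped computation in Example 2 can be mirrored locally as a rough sketch that needs no database connection. It assumes only the ten rows displayed above are available, so its results will not match the Example 2 output, which is computed over the full table. The variable names are illustrative.

```python
from collections import defaultdict
import statistics

# (programming, gpa) pairs copied from the ten rows displayed above.
# Assumption: this is only the visible preview, not the whole table.
rows = [
    ("Beginner", 3.46), ("Novice", 3.00), ("Advanced", 4.00),
    ("Beginner", 2.65), ("Novice", 3.44), ("Advanced", 3.83),
    ("Beginner", 3.85), ("Novice", 4.00), ("Advanced", 3.57),
    ("Advanced", 1.98),
]

# Collect the gpa values for each level of programming.
groups = defaultdict(list)
for programming, gpa in rows:
    groups[programming].append(gpa)

# One sample variance per group; None stands in for NULL when a group
# has fewer than two data points.
grouped_variance = {
    level: statistics.variance(gpas) if len(gpas) >= 2 else None
    for level, gpas in groups.items()
}

for level, value in grouped_variance.items():
    print(level, value)
```

Each group is reduced independently, which is why the result has one row per distinct value of the grouping column.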
|