| |
- stddev_samp(value_expression)
- DESCRIPTION:
Function returns the sample standard deviation for the non-null data
points in value_expression.
The standard deviation is the square root of the variance, which is the
second central moment of a distribution. For a sample, it is a measure of
dispersion from the mean of that sample. The computation is more conservative
than that for the population standard deviation (it divides by n - 1 rather
than n) to minimize the effect of outliers on the computed value.
When there are fewer than two non-null data points in the sample used for the
computation, the function returns NULL.
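The behavior described above can be sketched in plain Python. This is only an
illustration of the arithmetic, not the database implementation: the helper
name sample_stddev and the sample values are hypothetical, and None stands in
for SQL NULL.

```python
import math
import statistics


def sample_stddev(values):
    """Hypothetical helper: sample standard deviation over non-null values.

    Returns None (standing in for SQL NULL) when fewer than two
    non-null data points are available.
    """
    # Ignore nulls, as the description says only non-null points are used.
    data = [v for v in values if v is not None]
    n = len(data)
    if n < 2:
        return None
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    # Divide by n - 1 (sample) rather than n (population).
    return math.sqrt(squared_deviations / (n - 1))


# Illustrative values only; the None entry is skipped.
gpa = [3.46, 3.00, None, 4.00, 2.65]
non_null_gpa = [v for v in gpa if v is not None]

sample_sd = sample_stddev(gpa)
population_sd = statistics.pstdev(non_null_gpa)

print("sample:", sample_sd)
print("population:", population_sd)
print("single point:", sample_stddev([3.46]))
```

For any data with at least two distinct non-null values, the sample result is
larger than the population result, because the divisor is smaller.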
PARAMETERS:
value_expression:
Required Argument.
Specifies a ColumnExpression of a numeric column for which sample
standard deviation is to be computed.
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
NOTE:
Function accepts positional arguments only.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
masters gpa stats programming admitted
id
22 yes 3.46 Novice Beginner 0
36 no 3.00 Advanced Novice 0
15 yes 4.00 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
5 no 3.44 Novice Novice 0
17 no 3.83 Advanced Advanced 1
34 yes 3.85 Advanced Beginner 0
13 no 4.00 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
19 yes 1.98 Advanced Advanced 0
>>>
# Example 1: Calculate the sample standard deviation for the values in
# "gpa" column.
# Import func from sqlalchemy to execute the stddev_samp function.
>>> from sqlalchemy import func
# Create a sqlalchemy Function object.
>>> stddev_samp_func_ = func.stddev_samp(admissions_train.gpa.expression)
>>>
# Pass the Function object as input to DataFrame.assign().
# Setting "drop_columns" to True keeps only the newly assigned column.
>>> df = admissions_train.assign(drop_columns=True, stddev_samp_gpa_=stddev_samp_func_)
>>> print(df)
stddev_samp_gpa_
0 0.513764
>>>
# Example 2: Calculate the sample standard deviation for the values in
# "gpa" column for each level of programming.
# Note:
# When assign() is run after DataFrame.groupby(), the function ignores
# the "drop_columns" argument.
>>> admissions_train.groupby("programming").assign(stddev_samp_gpa_=func.stddev_samp(admissions_train.gpa.expression))
programming stddev_samp_gpa_
0 Advanced 0.493990
1 Novice 0.646736
2 Beginner 0.354706
>>>
|