| |
- kurtosis(value_expression)
- DESCRIPTION:
Returns the kurtosis of the distribution of value_expression.
Kurtosis is derived from the fourth moment of the standardized (z) values.
It measures how outlier-prone (subject to rare, extreme observations) the
distribution is compared with the normal (or Gaussian) distribution.
* The normal distribution has a kurtosis of 0.
* Positive kurtosis indicates that the distribution is more outlier-prone than the
normal distribution.
* Negative kurtosis indicates that the distribution is less outlier-prone than the
normal distribution.
PARAMETERS:
value_expression:
Required Argument.
Specifies a ColumnExpression of a numeric column for which the kurtosis of
the distribution of its values is to be computed.
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
Notes:
1. Null values are not included in the result computation.
2. The following conditions produce a null result:
a. Fewer than three non-null data points in the data used for the computation.
b. The standard deviation of the column is equal to 0.
NOTE:
The function accepts positional arguments only.
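The description and notes above can be sketched in plain Python. This is a minimal illustration, not the database's implementation: the helper name excess_kurtosis is hypothetical, and it uses the simple population form (mean of z**4 minus 3), whereas the database may apply a sample-size correction, so its numbers will not necessarily match the function's output.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: mean of z**4 minus 3 (hypothetical helper)."""
    # Drop nulls, as the notes describe for the database function.
    data = [v for v in values if v is not None]
    n = len(data)
    # Mirror the documented rule: fewer than three non-null points gives null.
    if n < 3:
        return None
    mean = sum(data) / n
    # Second and fourth central moments (dividing by n: population form).
    m2 = sum((v - mean) ** 2 for v in data) / n
    m4 = sum((v - mean) ** 4 for v in data) / n
    # A zero standard deviation leaves the z-scores undefined, so return null.
    if m2 == 0:
        return None
    # m4 / m2**2 is the fourth moment of the z-scores; subtracting 3
    # places the normal distribution at 0.
    return m4 / m2 ** 2 - 3


flat = excess_kurtosis([1, 2, 3, 4, 5])    # evenly spread values, no outliers
spiky = excess_kurtosis([0] * 9 + [10])    # one rare, extreme observation
print(flat, spiky)
```

Under these assumptions, the evenly spread sample yields a negative value (less outlier-prone than the normal distribution) and the sample with a single extreme observation yields a positive value (more outlier-prone).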
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
masters gpa stats programming admitted
id
22 yes 3.46 Novice Beginner 0
36 no 3.00 Advanced Novice 0
15 yes 4.00 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
5 no 3.44 Novice Novice 0
17 no 3.83 Advanced Advanced 1
34 yes 3.85 Advanced Beginner 0
13 no 4.00 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
19 yes 1.98 Advanced Advanced 0
>>>
# Example 1: Calculate the kurtosis value for the "gpa" column.
# Import func from sqlalchemy to execute the kurtosis function.
>>> from sqlalchemy import func
# Create a sqlalchemy Function object.
>>> kurtosis_func_ = func.kurtosis(admissions_train.gpa.expression)
>>>
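# Pass the Function object as input to DataFrame.assign().
# The first positional argument (True) is "drop_columns", so only the
# newly assigned column is returned.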
>>> df = admissions_train.assign(True, kurtosis_gpa_=kurtosis_func_)
>>> print(df)
kurtosis_gpa_
0 4.052659
>>>
# Example 2: Calculate the kurtosis of the "gpa" column for each level of "programming".
# Note:
# When assign() is run after DataFrame.groupby(), the function ignores
# the "drop_columns" argument.
>>> admissions_train.groupby("programming").assign(kurtosis_gpa_=func.kurtosis(admissions_train.gpa.expression))
programming kurtosis_gpa_
0 Beginner 5.439392
1 Advanced 8.480554
2 Novice 1.420745
>>>
|