Example 1: Generate statistics for DataFrame "sales"
The following example computes count, mean, std, min, percentiles, and max for the numeric columns.
>>> df = DataFrame('sales')
>>> df
Feb Jan Mar Apr datetime
accounts
Alpha Co 210.0 200 215 250 04/01/2017
Red Inc 200.0 150 140 None 04/01/2017
Orange Inc 210.0 None None 250 04/01/2017
Jones LLC 200.0 150 140 180 04/01/2017
Yellow Inc 90.0 None None None 04/01/2017
Blue Inc 90.0 50 95 101 04/01/2017
>>> df.describe(pivot=True)
Apr Feb Mar Jan
func
count 4 6 4 4
mean 195.25 166.667 147.5 137.5
std 70.971 59.554 49.749 62.915
min 101 90 95 50
25% 160.25 117.5 128.75 125
50% 215 200 140 150
75% 250 207.5 158.75 162.5
max 250 210 215 200
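As a cross-check, the Apr column above can be recomputed with the Python standard library. This is a minimal sketch, not the library's implementation: it assumes that missing values (None) are skipped and that std is the sample standard deviation (n - 1 denominator), which is what the figures above are consistent with. The helper name `summarize` is hypothetical.

```python
import statistics

def summarize(values):
    """Count, mean, sample std, min, and max of the non-missing values."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "mean": statistics.mean(present),
        # Sample standard deviation; undefined for fewer than two values.
        "std": statistics.stdev(present) if len(present) > 1 else None,
        "min": min(present),
        "max": max(present),
    }

# Apr column of the "sales" table, in row order (None marks a missing value).
apr = [250, None, 250, 180, None, 101]
stats = summarize(apr)
print(stats)
```

Under these assumptions the result agrees with the Apr column of the describe() output: four non-missing values, a mean of 195.25, and a std of about 70.971.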
Example 2: Compute mean, min and max for numeric columns
>>> df.describe(pivot=True, statistics=['mean', 'min', 'max'], columns=['Jan', 'Feb', 'Mar', 'Apr'])
func Feb Jan Mar Apr
max 210.000 200.0 215.0 250.00
min 90.000 50.0 95.0 101.00
mean 166.667 137.5 147.5 195.25
Example 3: Use the percentiles argument to compute the 30th and 60th percentiles
>>> df.describe(percentiles=[.3, .6], pivot=True)
Apr Feb Mar Jan
func
count 4 6 4 4
mean 195.25 166.667 147.5 137.5
std 70.971 59.554 49.749 62.915
min 101 90 95 50
30% 172.1 145 135.5 140
60% 236 200 140 150
max 250 210 215 200
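The 30% and 60% rows above are consistent with linear interpolation between the sorted non-missing values. The sketch below illustrates that calculation for the Apr column. It is an assumption about how the values arise, not a statement of the library's algorithm, and the function name `percentile` is hypothetical.

```python
def percentile(values, q):
    """Quantile q (0 <= q <= 1) by linear interpolation between sorted values."""
    present = sorted(v for v in values if v is not None)
    pos = q * (len(present) - 1)              # fractional index into the sorted list
    lower = int(pos)                          # index of the value at or below pos
    upper = min(lower + 1, len(present) - 1)  # index of the next value, clamped
    frac = pos - lower                        # how far pos lies between the two
    return present[lower] + frac * (present[upper] - present[lower])

# Apr column of the "sales" table, in row order (None marks a missing value).
apr = [250, None, 250, 180, None, 101]
p30 = percentile(apr, 0.3)
p60 = percentile(apr, 0.6)
print(round(p30, 2), round(p60, 2))
```

For Apr the sorted values are 101, 180, 250, 250. The 30th percentile falls 0.9 of the way from 101 to 180, and the 60th falls 0.8 of the way from 180 to 250, which matches the 172.1 and 236 shown above up to floating-point rounding.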
Example 4: Use groupby to compute statistics for each group
>>> df1 = df.groupby(["datetime", "Feb"])
>>> df1.describe(pivot=True)
Jan Mar Apr
datetime Feb func
04/01/2017 90.0 25% 50 95 101
50% 50 95 101
75% 50 95 101
count 1 1 1
max 50 95 101
mean 50 95 101
min 50 95 101
std None None None
200.0 25% 150 140 180
50% 150 140 180
75% 150 140 180
count 2 2 1
max 150 140 180
mean 150 140 180
min 150 140 180
std 0 0 None
210.0 25% 200 215 250
50% 200 215 250
75% 200 215 250
count 1 1 2
max 200 215 250
mean 200 215 250
min 200 215 250
std None None 0
>>>
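The std rows in Example 4 show None for groups with a single non-missing value and 0 for groups whose two values are equal. The sketch below reproduces that pattern for the Jan column with the standard library, assuming a sample standard deviation that is undefined for fewer than two values. Because every row shares the same datetime, grouping by Feb alone is equivalent here. This is an illustration, not the library's implementation.

```python
import statistics
from collections import defaultdict

# (Feb, Jan) pairs from the "sales" table; None marks a missing value.
rows = [(210.0, 200), (200.0, 150), (210.0, None),
        (200.0, 150), (90.0, None), (90.0, 50)]

# Collect the non-missing Jan values for each Feb group.
groups = defaultdict(list)
for feb, jan in rows:
    if jan is not None:
        groups[feb].append(jan)

# Sample std needs at least two values; otherwise report None.
jan_std = {
    feb: statistics.stdev(vals) if len(vals) > 1 else None
    for feb, vals in groups.items()
}
print(jan_std)
```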
Example 5: Compute count, mean, std, min, percentiles, and max for numeric columns with default arguments and pivot set to False
>>> df.describe(pivot=False)
| ATTRIBUTE | StatName | StatValue |
|---|---|---|
| Jan | MAXIMUM | 200.0 |
| Jan | STANDARD DEVIATION | 62.91528696058958 |
| Jan | PERCENTILES(25) | 125.0 |
| Jan | PERCENTILES(50) | 150.0 |
| Mar | COUNT | 4.0 |
| Mar | MINIMUM | 95.0 |
| Mar | MAXIMUM | 215.0 |
| Mar | MEAN | 147.5 |
| Mar | STANDARD DEVIATION | 49.749371855331 |
| Mar | PERCENTILES(25) | 128.75 |
| Mar | PERCENTILES(50) | 140.0 |
| Apr | COUNT | 4.0 |
| Apr | MINIMUM | 101.0 |
| Apr | MAXIMUM | 250.0 |
| Apr | MEAN | 195.25 |
| Apr | STANDARD DEVIATION | 70.97123830585646 |
| Apr | PERCENTILES(25) | 160.25 |
| Apr | PERCENTILES(50) | 215.0 |
| Apr | PERCENTILES(75) | 250.0 |
| Feb | COUNT | 6.0 |
| Feb | MINIMUM | 90.0 |
| Feb | MAXIMUM | 210.0 |
| Feb | MEAN | 166.66666666666666 |
| Feb | STANDARD DEVIATION | 59.553897157672786 |
| Feb | PERCENTILES(25) | 117.5 |
| Feb | PERCENTILES(50) | 200.0 |
| Feb | PERCENTILES(75) | 207.5 |
| Mar | PERCENTILES(75) | 158.75 |
| Jan | PERCENTILES(75) | 162.5 |
| Jan | MEAN | 137.5 |
| Jan | MINIMUM | 50.0 |
| Jan | COUNT | 4.0 |
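With pivot=False each statistic is one (ATTRIBUTE, StatName, StatValue) row, and the rows come back in no particular order. The sketch below shows one way such rows could be regrouped by statistic on the client side. It uses plain Python tuples copied from the output above and is not part of the library's API.

```python
# A few (ATTRIBUTE, StatName, StatValue) rows copied from the output above.
long_rows = [
    ("Jan", "MAXIMUM", 200.0),
    ("Jan", "MINIMUM", 50.0),
    ("Apr", "MAXIMUM", 250.0),
    ("Apr", "MINIMUM", 101.0),
]

# Regroup into {statistic: {column: value}}, similar in shape to pivot=True.
wide = {}
for attribute, stat_name, stat_value in long_rows:
    wide.setdefault(stat_name, {})[attribute] = stat_value

for stat_name, by_column in wide.items():
    print(stat_name, by_column)
```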