describe() Method | Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2025-12-05
Product Category: Teradata Vantage

The describe() function generates statistics for numeric columns. This function can be used in two modes:
  • Regular Aggregate Mode

    It computes the count, mean, std, min, percentiles, and max for numeric columns.

    Default statistics include: "count", "mean", "std", "min", "percentile", "max".

    If describe() is used on the output of any DataFrame API or groupby(), then it is used in regular aggregate mode.
  • Time Series Aggregate Mode

    It computes the max, mean, min, and std for numeric columns by default. When verbose is set to True, it also computes the median, mode, and percentiles.

    Default statistics include: 'max', 'mean', 'min', 'std'.

    If describe() is used on the output of groupby_time(), then it is used in time series aggregate mode, where time series aggregates are used to calculate the statistics.
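
The two modes differ only in the object that describe() is called on. The following is a minimal sketch of that distinction, not the package's own code. It assumes a teradataml-style DataFrame object that exposes describe() and groupby_time(); the helper names are hypothetical, and the timebucket_duration keyword is an assumption that should be checked against the groupby_time() reference. Running it against real data requires an active Vantage connection.

```python
def describe_regular(df):
    """Regular aggregate mode: describe() is called directly on a DataFrame."""
    return df.describe()


def describe_time_series(df, timebucket_duration):
    """Time series aggregate mode: describe() is called on groupby_time() output.

    Assumption: groupby_time() accepts the bucket width through a keyword
    named timebucket_duration.
    """
    grouped = df.groupby_time(timebucket_duration=timebucket_duration)
    return grouped.describe()
```

Because the helpers only rely on the two method names, they can be exercised with any object that provides describe() and groupby_time().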

Optional Arguments

percentiles
A list of values between 0 and 1 used for computing percentiles.

The default value is [.25, .5, .75], which generates the 25th, 50th, and 75th percentiles.
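
To make the meaning of the percentiles values concrete, here is a small pure-Python sketch that computes the default percentiles for a list of numbers. It is an illustration only: the interpolation method that Vantage applies is not stated here, so linear interpolation between neighboring values is an assumption, and the function name is hypothetical.

```python
def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) of values.

    Assumption: linear interpolation between the two nearest sorted values.
    """
    if not 0 <= q <= 1:
        raise ValueError("percentile must be between 0 and 1")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = q * (len(ordered) - 1)      # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


data = [10, 20, 30, 40, 50]
# Label each result the way a percentile is usually displayed, for example "25%".
defaults = {f"{int(q * 100)}%": percentile(data, q) for q in [.25, .5, .75]}
```

For this sample data, the three default percentiles fall exactly on the second, third, and fourth values.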

include
The value for this argument can be either None or 'all', and specifies whether non-numeric columns are included in the computation.
  • If the value is 'all': Both numeric and non-numeric columns are included. The function computes count, mean, std, min, percentiles, and max for numeric columns, and computes count and unique for non-numeric columns.
  • If the value is None: Only numeric columns are used for collecting statistics.

The default value is None.

Value 'all' is not applicable for Time Series Aggregate Mode.
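
The effect of include can be sketched in plain Python on a small in-memory table. This is an illustration under stated assumptions, not the package's implementation: the function name is hypothetical, None stands in for NULL and is ignored, and only a subset of the numeric statistics (count, mean, min, max) is computed to keep the example short.

```python
from statistics import mean


def summarize(columns, include=None):
    """Summarize a dict that maps column names to lists of values."""
    if include not in (None, "all"):
        raise ValueError("include must be None or 'all'")
    summary = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        numeric = bool(present) and all(isinstance(v, (int, float)) for v in present)
        if numeric:
            summary[name] = {
                "count": len(present),
                "mean": mean(present),
                "min": min(present),
                "max": max(present),
            }
        elif include == "all":
            # Non-numeric columns only get count and unique.
            summary[name] = {"count": len(present), "unique": len(set(present))}
    return summary


table = {"amount": [10, 20, 30], "region": ["east", "west", "east", None]}
numeric_only = summarize(table)                 # include=None
everything = summarize(table, include="all")
```

With include left at None, the non-numeric region column is skipped entirely; with 'all', it appears with count and unique only.
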
verbose
Specifies a boolean value used for time series aggregation, indicating whether to produce verbose output. When this argument is set to True, the function calculates median, mode, and percentile values in addition to its default statistics.

The default value is False. Setting verbose to True is not applicable for Regular Aggregate Mode, where False is the only acceptable value.

distinct
Specifies a boolean value that decides whether duplicate rows are considered when calculating statistics.
By default, this argument is set to False, which means that duplicate rows are considered when calculating statistics.

When this is set to True, only distinct rows are considered when calculating statistics.
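
The difference that distinct makes can be shown with a single statistic. The sketch below is plain Python with a hypothetical function name, and it assumes that duplicates are removed per column before aggregating, in the manner of SQL AVG(DISTINCT ...); how the package applies the flag internally is not described here.

```python
from statistics import mean


def column_mean(values, distinct=False):
    """Mean of one column; with distinct=True each distinct value counts once."""
    if distinct:
        values = list(set(values))   # assumption: de-duplicate per column
    return mean(values)


amounts = [1, 1, 2, 6]
with_duplicates = column_mean(amounts)                 # duplicates counted
without_duplicates = column_mean(amounts, distinct=True)
```

The repeated value pulls the mean down when duplicates are counted, so the two results differ.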

statistics
Specifies the aggregate operation to be performed.
  • Computes count, mean, std, min, percentiles, and max for numeric columns.
  • Computes count and unique for non-numeric columns.
statistics is not applicable for 'Time Series Aggregate Mode'.

statistics should not be used with include as 'all'.

Permitted Values: count, mean, min, max, unique, std, describe, percentile

The default value is None.

Types: str or List of str

columns
Specifies the name(s) of the columns for which statistics are collected.

The default value is None.

Types: str or List of str

pivot
Specifies a boolean value indicating whether to pivot the output.
pivot is not supported for PTI (Primary Time Index) tables.

The default value is False.
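
The restrictions stated for include, verbose, statistics, and pivot can be collected into one check. The following sketch is a hypothetical helper, not part of the package: it encodes only the rules given in this topic, and the time_series and pti_table flags are illustrative stand-ins for information that the real function derives from the object it is called on.

```python
PERMITTED_STATISTICS = {
    "count", "mean", "min", "max", "unique", "std", "describe", "percentile",
}


def check_describe_arguments(time_series, include=None, verbose=False,
                             statistics=None, pivot=False, pti_table=False):
    """Raise ValueError for argument combinations that this topic rules out."""
    if include not in (None, "all"):
        raise ValueError("include must be None or 'all'")

    # statistics may be a single string or a list of strings.
    requested = [statistics] if isinstance(statistics, str) else list(statistics or [])
    unknown = set(requested) - PERMITTED_STATISTICS
    if unknown:
        raise ValueError(f"unsupported statistics: {sorted(unknown)}")

    if time_series:
        if include == "all":
            raise ValueError("include='all' is not applicable in time series aggregate mode")
        if requested:
            raise ValueError("statistics is not applicable in time series aggregate mode")
    elif verbose:
        raise ValueError("verbose=True is not applicable in regular aggregate mode")

    if requested and include == "all":
        raise ValueError("statistics should not be used with include='all'")
    if pivot and pti_table:
        raise ValueError("pivot is not supported for PTI tables")
```

For example, check_describe_arguments(time_series=False, statistics=["count", "mean"]) passes silently, while check_describe_arguments(time_series=True, include="all") raises ValueError.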