Time Series Aggregate Functions - Teradata Package for Python

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2022-01-14
dita:id: B700-4006
Product Category: Teradata Vantage

teradataml supports the following time series aggregate functions, which can be invoked on the result of DataFrame.groupby_time() or DataFrame.resample().

See the teradataml: Time Series Functions section of the Teradata Package for Python Function Reference (B700-4008) at https://docs.teradata.com/ for detailed descriptions and usage examples of these functions.

Sr. No. Function Name Description
1 bottom() Returns the specified number of smallest values in the columns for each group, with or without ties.
2 count() Returns column-wise count for each group.
3 describe()

Generates statistics for numeric columns. It can compute max, mean, min, std, median, mode, and percentiles.

Default statistics include: 'max', 'mean', 'min', 'std'.

4 delta_t() Calculates the time difference, or DELTA_T, between a starting and an ending event. The calculation is performed against a time-ordered time series data set.
5 first() Returns the oldest value, determined by the timecode, for each group.
6 kurtosis()

Returns column-wise kurtosis value for each group.

Kurtosis is the fourth moment of the distribution of the standardized (z) values.

It is a measure of the outlier (rare, extreme observation) character of the distribution as compared with the normal (or Gaussian) distribution.

  • The normal distribution has a kurtosis of 0.
  • Positive kurtosis indicates that the distribution is more outlier-prone than the normal distribution.
  • Negative kurtosis indicates that the distribution is less outlier-prone than the normal distribution.
7 last() Returns the newest value, determined by the timecode, for each group.
8 mad() Median Absolute Deviation (MAD) returns the median of the set of values defined as the absolute value of the difference between each value and the median of all values in each group.
9 max() Returns column-wise maximum value for each group.
10 mean() Returns column-wise mean value for each group.
11 median() Returns column-wise median value for each group.
12 min() Returns column-wise minimum value for each group.
13 mode() Returns the column-wise mode of all values in each group.
14 percentile()

Returns the value that represents the desired percentile from each group.

The result value is determined by the desired index (di) in an ordered list of values. The following equation is for the di:

di = (number of values in group - 1) * percentile/100

When di is a whole number, the value at that index is the returned result. The di can also fall between two data points, i and j, where i < j. In that case, the result is interpolated according to the value specified in the interpolation argument.

15 skew()

Returns column-wise skewness of the distribution for each group.

Skewness is the third moment of a distribution. It is a measure of the asymmetry of the distribution about its mean compared with the normal (or Gaussian) distribution.

  • The normal distribution has a skewness of 0.
  • Positive skewness indicates a distribution having an asymmetric tail extending toward more positive values.
  • Negative skewness indicates an asymmetric tail extending toward more negative values.
16 std()

Returns the column-wise sample or population standard deviation for each group. The standard deviation is the square root of the variance, which is the second central moment of a distribution.

  • For a sample, it is a measure of dispersion from the mean of that sample.
  • For a population, it is a measure of dispersion from the mean of that population.

The sample computation divides by n - 1 rather than n, which makes it the more conservative of the two: for the same data, it yields a value at least as large as the population standard deviation.

17 sum() Returns column-wise sum value for each group.
18 var()

Returns the column-wise sample or population variance for each group.

  • The variance of a population is a measure of dispersion from the mean of that population.
  • The variance of a sample is a measure of dispersion from the mean of that sample. It is the square of the sample standard deviation.
19 top() Returns the specified number of largest values in the columns for each group, with or without ties.
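
The mad() description in the table can be illustrated in plain Python, independent of teradataml and of any database connection. This is a minimal sketch of the unscaled definition only: the helper name `median_absolute_deviation` is hypothetical, it works on one group's values at a time, and it does not model any constant multiplier or NULL handling that the database function may apply.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute differences between each value and the group median.

    Hypothetical helper for illustration; not part of teradataml.
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    center = median(values)  # median of all values in the group
    return median(abs(v - center) for v in values)


# One outlier (100) barely moves the result, unlike the standard deviation.
group = [1, 2, 3, 4, 100]
mad_value = median_absolute_deviation(group)
```

Because both steps use the median, a single extreme value in the group has little influence on the result.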
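
The desired-index rule given for percentile() can be sketched the same way. The code below is an illustration under stated assumptions, not the library's implementation: the function name `percentile_of_group` and the mode names "linear", "low", "high", and "midpoint" are chosen here for the example, and the accepted values of the real interpolation argument should be checked in the Function Reference.

```python
import math


def percentile_of_group(values, percentile, interpolation="linear"):
    """Apply di = (number of values - 1) * percentile / 100 to sorted values.

    Hypothetical helper; mode names are assumptions made for this sketch.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")

    ordered = sorted(values)
    di = (len(ordered) - 1) * percentile / 100
    i, j = math.floor(di), math.ceil(di)

    # di is a whole number: the value at that index is the result.
    if i == j:
        return ordered[i]

    # di falls between data points i and j (i < j): interpolate.
    low, high = ordered[i], ordered[j]
    if interpolation == "linear":
        return low + (high - low) * (di - i)
    if interpolation == "low":
        return low
    if interpolation == "high":
        return high
    if interpolation == "midpoint":
        return (low + high) / 2
    raise ValueError(f"unknown interpolation mode: {interpolation!r}")
```

For example, with the four values 10, 20, 30, 40 and percentile 50, di = 3 * 50 / 100 = 1.5, which lies between index 1 (value 20) and index 2 (value 30), so linear interpolation gives 25.0.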
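
The sample and population variants of std() and var() differ only in the divisor. The following sketch uses Python's standard `statistics` module on a small made-up list to show the relationship; it is a local illustration of the definitions and makes no claim about how the database computes them.

```python
from statistics import pstdev, pvariance, stdev, variance

# Illustrative data only: mean is 5, sum of squared deviations is 32.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

population_variance = pvariance(values)  # divides by n      -> 32 / 8
sample_variance = variance(values)       # divides by n - 1  -> 32 / 7

population_std = pstdev(values)          # square root of population variance
sample_std = stdev(values)               # square root of sample variance
```

The sample figures are the larger ones because the same sum of squared deviations is divided by n - 1 instead of n.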