Teradata Package for Python Function Reference | 20.00 - mad - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
Teradata® Package for Python Function Reference - 20.00
- Deployment
- VantageCloud
- VantageCore
- Edition
- Enterprise
- IntelliFlex
- VMware
- Product
- Teradata Package for Python
- Release Number
- 20.00.00.03
- Published
- December 2024
- ft:lastEdition
- 2024-12-19
- Product Category
- Teradata Vantage
- teradataml.dataframe.sql.DataFrameColumn.mad = mad(self, constant_multiplier=None, **kwargs)
- DESCRIPTION:
Returns the Median Absolute Deviation (MAD) of the column values in each
group: the median of the absolute differences between each value and the
median of all values in that group.
Formula for computing MAD is as follows:
MAD = b * Mi(|Xi - Mj(Xj)|)
Where,
b = Some numeric constant. Default value is 1.4826.
Mj(Xj) = Median of the original set of values.
Xi = The original set of values.
Mi = Median of absolute value of the difference between
each value in Xi and the Median calculated in Mj(Xj).
Note:
1. This function is valid only on columns with numeric types.
2. Null values are not included in the result computation.
3. This can only be used as a Time Series Aggregate function.
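The formula and notes above can be checked locally with a minimal pure-Python sketch. It is not how Vantage computes the result. The helper name mad_sketch and the sample values are hypothetical, and only the Python standard library is used.

```python
from statistics import median


def mad_sketch(values, constant_multiplier=1.4826):
    """Hypothetical local helper: MAD = b * Mi(|Xi - Mj(Xj)|)."""
    if constant_multiplier < 0:
        raise ValueError("constant_multiplier must be >= 0")
    # Nulls (None here) are left out of the computation, as in Note 2.
    xs = [x for x in values if x is not None]
    if not xs:
        return None  # assumption: no non-null values yields no result
    center = median(xs)                              # Mj(Xj)
    deviations = [abs(x - center) for x in xs]       # |Xi - Mj(Xj)|
    return constant_multiplier * median(deviations)  # b * Mi(...)


sample = [1, 2, None, 3, 4, 100]  # made-up values, not from ocean_buoys
print(mad_sketch(sample))         # default multiplier
print(mad_sketch(sample, 2))      # multiplier of 2
```

For this sample the median is 3, the absolute deviations are 2, 1, 0, 1 and 97, and their median is 1, so the two calls print 1.4826 and 2.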
PARAMETERS:
constant_multiplier:
Optional Argument.
Specifies a numeric value to be used as the constant multiplier
(b in the above formula). It must be a numeric value
greater than or equal to 0.
Note:
When this argument is not used, Vantage uses 1.4826 as
constant multiplier.
Default Values: None
Types: int or float
kwargs:
Specifies optional keyword arguments.
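This reference does not say where the default constant multiplier of 1.4826 comes from. It matches the scale factor conventionally applied to MAD so that, for normally distributed data, the result approximates the standard deviation: the reciprocal of the 75th percentile of the standard normal distribution. The sketch below checks that value with the Python standard library. It is an illustration under that assumption, not a statement about Vantage internals.

```python
from statistics import NormalDist

# 75th percentile of the standard normal distribution, about 0.6745.
q75 = NormalDist().inv_cdf(0.75)

# Its reciprocal is the conventional MAD scale factor.
scale = 1 / q75
print(round(scale, 4))
```

Rounded to four decimal places, the result is 1.4826.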
RETURNS:
ColumnExpression
RAISES:
RuntimeError - If column does not support the aggregate operation.
EXAMPLES:
>>> # Load the example datasets.
>>> from teradataml import DataFrame, load_example_data
>>> load_example_data("dataframe", ["ocean_buoys", "ocean_buoys_seq", "ocean_buoys_nonpti"])
# Example 1: Calculate Median Absolute Deviation for the 'temperature' column over
# 1 calendar day of timebucket duration. Use the default constant multiplier,
# so no arguments need to be passed.
>>> # Create the required DataFrame on the non-sequenced PTI table.
>>> ocean_buoys = DataFrame("ocean_buoys")
>>> ocean_buoys_grpby1 = ocean_buoys.groupby_time(timebucket_duration="1cd", value_expression="buoyid", fill="NULLS")
>>> ocean_buoys_grpby1.temperature.mad()
# Example 2: Calculate MAD values using 2 as constant multiplier for the 'temperature'
# column in ocean_buoys_seq DataFrame on sequenced PTI table.
>>> # DataFrame on sequenced PTI table
>>> ocean_buoys_seq = DataFrame("ocean_buoys_seq")
>>> ocean_buoys_seq_grpby1 = ocean_buoys_seq.groupby_time(timebucket_duration="CAL_DAYS(2)", value_expression="buoyid", fill="NULLS")
>>> ocean_buoys_seq_grpby1.temperature.mad(constant_multiplier=2)