ts.mad() |Median Absolute Deviation | Teradata Package for R - ts.mad() - Teradata Package for R

Teradata® Package for R User Guide

Product
Teradata Package for R
Release Number
17.00
Published
July 2021
Language
English (United States)
Last Update
2023-08-08
dita:mapPath
yih1585763700215.ditamap
dita:ditavalPath
ayr1485454803741.ditaval
dita:id
B700-4005
Product Category
Teradata Vantage

The aggregate function ts.mad() returns the median of the set of values defined as the absolute value of the difference between each value and the median of all values in each group.

  • This function is valid only on columns with numeric types.
  • Nulls are not included in the result computation.
Arguments:
  • value.expression: Specifies the column from which top values are to be returned.
  • multiplier: Specifies a literal numeric constant.

    If this argument is NULL, default multiplier 1.4826 is used for MAD computation.

Formula for computing MAD is as follows:

MAD = b * Mi(|Xi - Mj(Xj)|)

Where,
  • b = Some numeric constant. Default value is 1.4826.
  • Mj(Xj) = Median of the original set of values.
  • Xi = The original set of values.
  • Mi = Median of absolute value of the difference between each value in Xi and the Median calculated in Mj(Xj).

Example 1: Calculate the MAD value of the 'temperature' column of sequenced PTI table

  • Calculate the MAD value.
    > df_seq_mad <- df_seq_grp %>% summarise(mad_temp = ts.mad(temperature))
  • Print the results.
    > df_seq_mad %>% arrange(TIMECODE_RANGE, buoyid, mad_temp)
    # Source:     lazy query [?? x 4]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, buoyid, mad_temp
      TIMECODE_RANGE                                     `GROUP BY TIME(MINUTES(~ buoyid mad_temp
      <chr>                                              <int64>                   <int>    <dbl>
    1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:30~ 35345                         0    66.0
    2 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:30~ 35347                         1     5.19
    3 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:30~ 35349                        44     4.45
    4 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00~ 35350                        22     0  
    5 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00~ 35350                        44     0  
    6 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:30~ 35371                         2     1.48

Example 2: Calculate the MAD value of the 'temperature' column of non-PTI table

  • Calculate the MAD value.
    > df_nonpti_mad <- df_nonpti_grp %>% summarise(mad_temp = ts.mad(temperature, 5))
  • Print the results.
    > df_nonpti_mad %>% arrange(TIMECODE_RANGE, mad_temp)
    # Source:     lazy query [?? x 3]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, mad_temp
       TIMECODE_RANGE                                           `GROUP BY TIME(MINUTES(~ mad_temp
       <chr>                                                    <int64>                     <dbl>
     1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:01:00.00~ 23149921                        0
     2 2014-01-06 08:01:00.000000+00:00,2014-01-06 08:02:00.00~ 23149922                        0
     3 2014-01-06 08:02:00.000000+00:00,2014-01-06 08:03:00.00~ 23149923                        0
     4 2014-01-06 08:03:00.000000+00:00,2014-01-06 08:04:00.00~ 23149924                        0
     5 2014-01-06 08:04:00.000000+00:00,2014-01-06 08:05:00.00~ 23149925                        0
     6 2014-01-06 08:05:00.000000+00:00,2014-01-06 08:06:00.00~ 23149926                        0
     7 2014-01-06 08:06:00.000000+00:00,2014-01-06 08:07:00.00~ 23149927                        0
     8 2014-01-06 08:07:00.000000+00:00,2014-01-06 08:08:00.00~ 23149928                        0
     9 2014-01-06 08:08:00.000000+00:00,2014-01-06 08:09:00.00~ 23149929                       NA
    10 2014-01-06 08:09:00.000000+00:00,2014-01-06 08:10:00.00~ 23149930                        0
    # ... with more rows