ts.mad() |Median Absolute Deviation |Teradata R Package - 17.00 - ts.mad() - Teradata R Package

Teradata® R Package User Guide

prodname
Teradata R Package
vrm_release
17.00
created_date
November 2020
category
User Guide
featnum
B700-4005-090K

The aggregate function ts.mad() returns the median of the set of values defined as the absolute value of the difference between each value and the median of all values in each group.

  • This function is valid only on columns with numeric types.
  • Nulls are not included in the result computation.
Arguments:
  • value.expression: Specifies the column from which top values are to be returned.
  • multiplier: Specifies a literal numeric constant.

    If this argument is NULL, default multiplier 1.4826 is used for MAD computation.

Formula for computing MAD is as follows:

MAD = b * Mi(|Xi - Mj(Xj)|)

Where,
  • b = Some numeric constant. Default value is 1.4826.
  • Mj(Xj) = Median of the original set of values.
  • Xi = The original set of values.
  • Mi = Median of absolute value of the difference between each value in Xi and the Median calculated in Mj(Xj).

Example 1: Calculate the MAD value of the 'temperature' column of sequenced PTI table

  • Calculate the MAD value.
    > df_seq_mad <- df_seq_grp %>% summarise(mad_temp = ts.mad(temperature))
  • Print the results.
    > df_seq_mad %>% arrange(TIMECODE_RANGE, buoyid, mad_temp)
    # Source:     lazy query [?? x 4]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, buoyid, mad_temp
      TIMECODE_RANGE                                     `GROUP BY TIME(MINUTES(~ buoyid mad_temp
      <chr>                                              <int64>                   <int>    <dbl>
    1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:30~ 35345                         0    66.0
    2 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:30~ 35347                         1     5.19
    3 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:30~ 35349                        44     4.45
    4 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00~ 35350                        22     0  
    5 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00~ 35350                        44     0  
    6 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:30~ 35371                         2     1.48

Example 2: Calculate the MAD value of the 'temperature' column of non-PTI table

  • Calculate the MAD value.
    > df_nonpti_mad <- df_nonpti_grp %>% summarise(mad_temp = ts.mad(temperature, 5))
  • Print the results.
    > df_nonpti_mad %>% arrange(TIMECODE_RANGE, mad_temp)
    # Source:     lazy query [?? x 3]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, mad_temp
       TIMECODE_RANGE                                           `GROUP BY TIME(MINUTES(~ mad_temp
       <chr>                                                    <int64>                     <dbl>
     1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:01:00.00~ 23149921                        0
     2 2014-01-06 08:01:00.000000+00:00,2014-01-06 08:02:00.00~ 23149922                        0
     3 2014-01-06 08:02:00.000000+00:00,2014-01-06 08:03:00.00~ 23149923                        0
     4 2014-01-06 08:03:00.000000+00:00,2014-01-06 08:04:00.00~ 23149924                        0
     5 2014-01-06 08:04:00.000000+00:00,2014-01-06 08:05:00.00~ 23149925                        0
     6 2014-01-06 08:05:00.000000+00:00,2014-01-06 08:06:00.00~ 23149926                        0
     7 2014-01-06 08:06:00.000000+00:00,2014-01-06 08:07:00.00~ 23149927                        0
     8 2014-01-06 08:07:00.000000+00:00,2014-01-06 08:08:00.00~ 23149928                        0
     9 2014-01-06 08:08:00.000000+00:00,2014-01-06 08:09:00.00~ 23149929                       NA
    10 2014-01-06 08:09:00.000000+00:00,2014-01-06 08:10:00.00~ 23149930                        0
    # ... with more rows