ts.sd() | Teradata Package for R

Teradata® Package for R User Guide

Product
Teradata Package for R
Release Number
17.00
Published
July 2021
Language
English (United States)
Last Update
2023-08-08
dita:id
B700-4005
Product Category
Teradata Vantage

The aggregate function ts.sd() returns the sample standard deviation of the values in a column, grouped by time.

The standard deviation is the square root of the variance, which is the second central moment of a distribution.
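
For reference, the textbook definition of the sample standard deviation is shown below. This is the conventional formula, not one quoted from the Teradata documentation. It is assumed here that ts.sd() follows it, with n being the number of non-NULL values in a time bucket.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```

The divisor n - 1 means that s is undefined when n is less than 2.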

  • When the sample used for the computation contains fewer than two non-NULL data points, ts.sd() returns NULL (shown as NA in R).
  • NULL values are excluded from the computation.
  • If the data represents only a sample of the entire population for the column, Teradata recommends using ts.sd() to calculate the sample standard deviation instead of ts.sdp(), which calculates the population standard deviation. As the sample size increases, the values of ts.sd() and ts.sdp() approach the same number.
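
The behaviors listed above can be reproduced locally with base R, which may help when checking results. This is only a sketch: it runs on plain R vectors and does not call Teradata or tdplyr. Base R's sd() uses the n - 1 divisor, and pop_sd() is a hypothetical helper written for this example, since base R has no built-in population standard deviation.

```r
# Base R illustration only; nothing here touches Teradata or tdplyr.
# pop_sd() is a hypothetical helper (n divisor instead of n - 1).
pop_sd <- function(x) {
  x <- x[!is.na(x)]
  sqrt(mean((x - mean(x))^2))
}

temps <- c(10, 43, NA, 99, 70)

# Sample standard deviation, skipping the missing value.
sample_result <- sd(temps, na.rm = TRUE)
population_result <- pop_sd(temps)

# Only one non-missing value: the sample standard deviation is undefined.
single_result <- sd(c(55, NA), na.rm = TRUE)

# With a large sample, the two measures are nearly identical.
big <- seq_len(10000)
ratio <- sd(big) / pop_sd(big)

print(c(sample = sample_result, population = population_result))
print(single_result)
print(ratio)
```

On the small vector the sample value is noticeably larger than the population value, while for the 10,000-element vector the ratio of the two is very close to 1.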
Arguments:
  • value.expression: Specifies the column for which the sample standard deviation is computed.

Use ts.sd(distinct(column_name)) to exclude duplicate values of the column when calculating the sample standard deviation.
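
The effect of excluding duplicates can be seen with a local base R analogue. The sketch below uses unique() on a plain vector as a stand-in for distinct(). It is an illustration of the idea, not the code that tdplyr generates.

```r
# Base R analogue of computing the statistic over distinct values only.
temps <- c(10, 10, 10, 20, 30)

# All values, duplicates included.
sd_all <- sd(temps)

# Duplicates removed first, comparable in spirit to ts.sd(distinct(column_name)).
sd_distinct <- sd(unique(temps))

print(sd_all)
print(sd_distinct)
```

Because the repeated value 10 pulls the mean down and changes the count, the two results differ.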

Example 1: Calculate the sample standard deviation of the 'temperature' column of a sequenced PTI table

  • Calculate the sample standard deviation.
    > df_seq_sd <- df_seq_grp %>% summarise(sd_temp = ts.sd(temperature))
  • Print the results.
    > df_seq_sd %>% arrange(TIMECODE_RANGE, buoyid, sd_temp)
    # Source:     lazy query [?? x 4]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, buoyid, sd_temp
      TIMECODE_RANGE                                      `GROUP BY TIME(MINUTES(~ buoyid sd_temp
      <chr>                                               <int64>                   <int>   <dbl>
    1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:30:~ 35345                         0   51.7
    2 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:30:~ 35347                         1    3.94
    3 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:30:~ 35349                        44    5.81
    4 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00:~ 35350                        22   NA  
    5 2014-01-06 10:30:00.000000+00:00,2014-01-06 11:00:~ 35350                        44    0  
    6 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:30:~ 35371                         2    1  

Example 2: Calculate the sample standard deviation of the 'temperature' column of a non-PTI table

  • Calculate the sample standard deviation.
    > df_nonpti_sd <- df_nonpti %>%
    +   group_by_time(timebucket.duration = "10m", timecode.column = "TIMECODE") %>%
    +   summarise(sd_temp = ts.sd(temperature))
  • Print the results.
    > df_nonpti_sd %>% arrange(TIMECODE_RANGE, sd_temp)
    # Source:     lazy query [?? x 3]
    # Database:   [Teradata 16.20.50.01] [Teradata Native Driver 17.0.0.2]
    #   [TDAPUSER@<hostname>/TDAPUSERDB]
    # Ordered by: TIMECODE_RANGE, sd_temp
      TIMECODE_RANGE                                            `GROUP BY TIME(MINUTES(1~ sd_temp
      <chr>                                                     <int64>                     <dbl>
    1 2014-01-06 08:00:00.000000+00:00,2014-01-06 08:10:00.000~ 2314993                     62.9
    2 2014-01-06 08:10:00.000000+00:00,2014-01-06 08:20:00.000~ 2314994                     63.6
    3 2014-01-06 09:00:00.000000+00:00,2014-01-06 09:10:00.000~ 2314999                      3.94
    4 2014-01-06 10:00:00.000000+00:00,2014-01-06 10:10:00.000~ 2315005                      5.76
    5 2014-01-06 10:10:00.000000+00:00,2014-01-06 10:20:00.000~ 2315006                     NA  
    6 2014-01-06 10:30:00.000000+00:00,2014-01-06 10:40:00.000~ 2315008                     NA  
    7 2014-01-06 10:50:00.000000+00:00,2014-01-06 11:00:00.000~ 2315010                     NA  
    8 2014-01-06 21:00:00.000000+00:00,2014-01-06 21:10:00.000~ 2315071                      1
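
The NA rows in the output above are consistent with time buckets that hold fewer than two non-NULL temperature readings. A minimal base R sketch of the same idea follows. It uses made-up readings (not the documentation's data set) and a hand-rolled 10-minute bucketing step, so it only approximates what group_by_time() does on the database side.

```r
# Made-up readings for illustration; not the data used in the examples above.
obs <- data.frame(
  timecode = as.POSIXct(
    c("2014-01-06 08:00:00", "2014-01-06 08:03:00",
      "2014-01-06 08:07:00", "2014-01-06 08:12:00"),
    tz = "UTC"
  ),
  temperature = c(10, 20, 30, 40)
)

# Floor each timestamp to the start of its 10-minute bucket.
bucket_seconds <- 10 * 60
bucket_start <- as.POSIXct(
  floor(as.numeric(obs$timecode) / bucket_seconds) * bucket_seconds,
  origin = "1970-01-01", tz = "UTC"
)

# Sample standard deviation per bucket; a single-row bucket yields NA.
sd_by_bucket <- tapply(obs$temperature, format(bucket_start, "%H:%M"), sd)
print(sd_by_bucket)
```

The 08:00 bucket contains three readings and produces a number, while the 08:10 bucket contains one reading and produces NA.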