ts.delta_t() | Teradata R Package - 17.00 - ts.delta_t() - Teradata R Package

Teradata® R Package User Guide

prodname
Teradata R Package
vrm_release
17.00
created_date
November 2020
category
User Guide
featnum
B700-4005-090K

The aggregate function ts.delta_t() calculates the time difference, or DELTA_T, between a starting and an ending event. The calculation is performed against a time-ordered time series data set.

This is the only Time Series Aggregate function that works with timebucket_duration as "*" in groupby_time(), that is, unbounded time.

When using groupby_time() with unbounded time, the following rules apply to the system virtual columns:
  • $TD_GROUP_BY_TIME: Always has a value of 1, since there is only one timebucket.
  • $TD_TIMECODE_RANGE: Composed of the first and last timecode values read for the group.
The data being evaluated in the filtering conditions (for example, the minimum and maximum temperature observation) must belong to the timecode value present in the same row of data. This is the expected behavior. However, this assumption can be violated when joining multiple tables together. It is possible to construct a query where the result of a join causes specific data points (for example, a temperature reading) to be present in a data row with a timecode that is not indicative of when that data point occurred. In such a scenario, it is highly likely that the results are not as expected, or are misleading. Vantage does not detect these types of queries, so you must make sure to preserve the correlation between data points and timecodes.
Key points about outcome of this function (resultant tdplyr tbl_teradata):
  • Function returns a column of PERIOD(TIMESTAMP WITH TIME ZONE) type (SQL Data type) composed of the start and end timecode, that is, timecode column used for aggregation of each start-end pair.
  • One result is returned per complete start-end pair found within the GROUP BY TIME window. The start-end pair process is as follows:
    • If the current source data meets the start condition, the current timecode is saved as the start time.
    • If the current source data meets the end condition, and a saved start timecode already exists, the start timecode is saved with the end timecode encountered as a result pair.
  • The processing algorithm implies that multiple results may be found in each group.
  • If no start-end pair is encountered, no result row is returned.
  • Any result of ts.delta_t() which has a delta less than 1 microsecond (including a delta of 0, in the case of a result which comes from a single point in time) is automatically rounded to 1 microsecond. This is strictly enforced to match Period data type semantics in Vantage which dictates that a starting and ending bound of a Period type may not be equivalent. The smallest granularity supported in Vantage is the microsecond, so these results are rounded accordingly.
  • ts.delta_t() must be used with group_by_time().
Arguments:
  • start.condition: Specifies any supported filtering condition that defines the start of the time period for which you are searching.
  • end.condition: Specifies any supported filtering condition that defines the end of the time period for which you are searching.

The following examples find the time elapsed between shipping and receiving of an item. Input data used for the examples contains information about parcels sent by a delivery service.

Example 1: Use tbl_teraadata on PTI table

This example showcases the usage of unbounded time in grouping.

  • Run the group_by_time() function, using unbounded time for timebucket.duration.
    > df_grp <- df_pack_pti %>% group_by_time(timebucket.duration = "*", value.expression = "parcel_number")
  • Run the 'ts.delta_t() function, with start and end conditions specified as strings.
    > df_out <- df_grp %>% summarise(delta_t = ts.delta_t("Status LIKE 'picked%up%customer'", "Status LIKE 'delivered%customer'"))
  • Print the results.
    > as.data.frame(df_out %>% arrange(TIMECODE_RANGE, parcel_number))
                                                         TIMECODE_RANGE GROUP BY TIME(*) parcel_number
    1 2012-01-01 00:00:00.000000+00:00,9999-12-31 23:59:59.999999+00:00                1            55
    2 2012-01-01 00:00:00.000000+00:00,9999-12-31 23:59:59.999999+00:00                1            75
                                                                delta_t
    1 2016-10-15 08:00:00.000000+00:00,2016-10-15 17:00:00.000000+00:00
    2 2016-10-15 08:00:00.000000+00:00,2016-10-15 17:00:00.000000+00:00

Example 2: Use tbl_teradata on Non-PTI Table

This example showcases the usage of unbounded time in grouping.

  • Run the group_by_time() function, using unbounded time for timebucket.duration.
    > df_grp <- df_pack_nonpti %>% group_by_time(timebucket.duration = "*", value.expression = "parcel_number", timecode.column = "clock_time")
  • Run the ts.delta_t() function, with start and end conditions specified as strings.
    > df_out <- df_grp %>% summarise(delta_t = ts.delta_t("Status LIKE 'picked%up%customer'", "Status LIKE 'delivered%customer'"))
  • Print the results.
    > as.data.frame(df_out %>% arrange(TIMECODE_RANGE, parcel_number))
                                                         TIMECODE_RANGE GROUP BY TIME(*) parcel_number
    1 1970-01-01 00:00:00.000000+00:00,9999-12-31 23:59:59.999999+00:00                1            55
    2 1970-01-01 00:00:00.000000+00:00,9999-12-31 23:59:59.999999+00:00                1            75
                                                                delta_t
    1 2016-10-15 08:00:00.000000+00:00,2016-10-15 17:00:00.000000+00:00
    2 2016-10-15 08:00:00.000000+00:00,2016-10-15 17:00:00.000000+00:00