Teradata® Package for R Function Reference

Product: Teradata Package for R
Release Number: 17.00
Published: July 2021
Language: English (United States)
Last Update: 2023-08-08
dita:id: B700-4007
Product Category: Teradata Vantage

DTW

Description

The DTW function performs dynamic time warping (DTW), which measures the similarity (warp distance) between two time series that vary in time or speed. You can use DTW to analyze any data that can be represented linearly, for example, video, audio, and graphics.
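
To make the idea of a warp distance concrete, the following is a minimal sketch of the classic dynamic-programming formulation of DTW in plain R. It is illustrative only: the helper name dtw_distance and its dist_fun parameter are hypothetical, the code works on local numeric vectors, and it is not the algorithm that td_dtw_mle runs in the database.

```r
# Minimal classic DTW between two numeric vectors (illustrative sketch;
# dtw_distance is a hypothetical helper, not part of the Teradata package).
dtw_distance <- function(x, y, dist_fun = function(a, b) abs(a - b)) {
  n <- length(x)
  m <- length(y)
  # cost[i + 1, j + 1] holds the minimal accumulated cost of aligning
  # x[1..i] with y[1..j]; row 1 and column 1 represent the empty prefix.
  cost <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      step <- dist_fun(x[i], y[j])
      cost[i + 1, j + 1] <- step + min(cost[i, j + 1],  # advance x only
                                       cost[i + 1, j],  # advance y only
                                       cost[i, j])      # advance both
    }
  }
  cost[n + 1, m + 1]
}

a <- c(1, 2, 3, 4, 5)
b <- c(1, 1, 2, 3, 4, 5)   # same shape as a, with the first point repeated
dtw_distance(a, a)         # 0
dtw_distance(a, b)         # 0: warping absorbs the repeated point
```

The second call shows why DTW suits series that vary in speed: the two vectors have different lengths, yet the warp distance is zero because one point of a is aligned with two points of b.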

Usage

  td_dtw_mle (
      data = NULL,
      template.data = NULL,
      mapping.data = NULL,
      input.columns = NULL,
      template.columns = NULL,
      timeseries.id = NULL,
      template.id = NULL,
      radius = 10,
      dist.method = "EuclideanDistance",
      warp.path = FALSE,
      data.sequence.column = NULL,
      template.data.sequence.column = NULL,
      mapping.data.sequence.column = NULL,
      data.partition.column = NULL,
      mapping.data.partition.column = NULL,
      data.order.column = NULL,
      template.data.order.column = NULL,
      mapping.data.order.column = NULL
  )

Arguments

data

Required Argument.
Specifies the tbl_teradata object which contains the time series information.

data.partition.column

Required Argument.
Specifies Partition By columns for argument "data".
Values for this argument can be provided as a vector if multiple columns are used for partitioning.
Types: character OR vector of Strings (character)

data.order.column

Required Argument.
Specifies Order By columns for argument "data".
Values for this argument can be provided as a vector if multiple columns are used for ordering.
Types: character OR vector of Strings (character)

template.data

Required Argument.
Specifies the tbl_teradata object which contains the template information.

template.data.order.column

Required Argument.
Specifies Order By columns for argument "template.data".
Values for this argument can be provided as a vector if multiple columns are used for ordering.
Types: character OR vector of Strings (character)

mapping.data

Required Argument.
Specifies the tbl_teradata object that contains the mapping between the rows of the tbl_teradata specified in the "data" argument and the rows of the tbl_teradata specified in the "template.data" argument.

mapping.data.partition.column

Required Argument.
Specifies Partition By columns for argument "mapping.data".
Values for this argument can be provided as a vector if multiple columns are used for partitioning.
Types: character OR vector of Strings (character)

mapping.data.order.column

Optional Argument.
Specifies Order By columns for argument "mapping.data".
Values for this argument can be provided as a vector if multiple columns are used for ordering.
Types: character OR vector of Strings (character)

input.columns

Required Argument.
Specifies the names of the columns from the "data" object that contain the values and timestamps of the time series.
Note: NaN or infinity values in these columns should be removed.
Types: character OR vector of Strings (character)
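
As a small illustration of the note above, the sketch below drops NaN and infinity values from a value column. It assumes a hypothetical local data frame named ts_local with made-up rows; for a tbl_teradata the equivalent filtering would have to happen in the database and is not shown here.

```r
# Hypothetical local data frame standing in for the time series input;
# the column names mirror the example data, the rows are invented.
ts_local <- data.frame(
  timeseriesid = c(1, 1, 1, 1),
  timestamp1   = 1:4,
  stockprice   = c(10.5, NaN, Inf, 11.2)
)

# is.finite() is FALSE for NaN, Inf, -Inf and NA, so one test covers them all.
ts_clean <- ts_local[is.finite(ts_local$stockprice), ]
nrow(ts_clean)  # 2
```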

template.columns

Required Argument.
Specifies the names of the columns from the "template.data" object that contain the values and timestamps of the time series.
Note: NaN or infinity values in these columns should be removed.
Types: character OR vector of Strings (character)

timeseries.id

Required Argument.
Specifies the names of the columns by which the tbl_teradata specified in the "data" argument is partitioned. These columns comprise the unique id for a time series in "data".
Types: character

template.id

Required Argument.
Specifies the names of the columns by which the tbl_teradata specified in the "template.data" argument is ordered. These columns comprise the unique id for a time series in "template.data".
Types: character

radius

Optional Argument.
Specifies the integer value that determines the projected warp path from a previous resolution.
Default Value: 10
Types: integer

dist.method

Optional Argument.
Specifies the metric for computing the warping distance. The supported values, which are case-sensitive, are:

  1. EuclideanDistance

  2. ManhattanDistance

  3. BinaryDistance

Default Value: "EuclideanDistance"
Types: character

warp.path

Optional Argument.
Determines whether to output the warping path.
Default Value: FALSE
Types: logical

data.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)

template.data.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "template.data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)

mapping.data.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "mapping.data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)

Value

The function returns an object of class "td_dtw_mle", which is a named list containing objects of class "tbl_teradata". The list member can be referenced directly with the "$" operator using its name: result.

Examples

  
    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("dtw_example", "timeseriesdata", "templatedata", "mappingdata")
    
    # Create object(s) of class "tbl_teradata".
    timeseriesdata <- tbl(con, "timeseriesdata")
    templatedata <- tbl(con, "templatedata")
    mappingdata <- tbl(con, "mappingdata")
    
    # Example 1 -
    # This example compares multiple time series to both a common template and each other.
    # Each time series represents stock prices and the template represents a series
    # of stock index prices.
    td_dtw_mle_out <- td_dtw_mle(data = timeseriesdata,
                                 data.partition.column = c("timeseriesid"),
                                 data.order.column = c("timestamp1"),
                                 template.data = templatedata,
                                 template.data.order.column = c("timestamp2"),
                                 mapping.data = mappingdata,
                                 mapping.data.partition.column = c("timeseriesid"),
                                 input.columns = c("stockprice", "timestamp1"),
                                 template.columns = c("indexprice", "timestamp2"),
                                 timeseries.id = "timeseriesid",
                                 template.id = "templateid"
                                 )
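
A further call, sketched here as an illustration, sets the optional arguments described above and then reads the output through the "result" member. It assumes the connection and the three tbl_teradata objects created for Example 1 already exist; the variable name td_dtw_mle_out2 is hypothetical, and no output is shown because it depends on the data in the connected system.

```r
# Example 2 (illustrative sketch) -
# Same inputs as Example 1, but with the optional arguments set explicitly:
# Manhattan distance, the default radius, and the warping path in the output.
td_dtw_mle_out2 <- td_dtw_mle(data = timeseriesdata,
                              data.partition.column = c("timeseriesid"),
                              data.order.column = c("timestamp1"),
                              template.data = templatedata,
                              template.data.order.column = c("timestamp2"),
                              mapping.data = mappingdata,
                              mapping.data.partition.column = c("timeseriesid"),
                              input.columns = c("stockprice", "timestamp1"),
                              template.columns = c("indexprice", "timestamp2"),
                              timeseries.id = "timeseriesid",
                              template.id = "templateid",
                              radius = 10,
                              dist.method = "ManhattanDistance",
                              warp.path = TRUE
                              )

# The output tbl_teradata is the "result" member of the returned list.
td_dtw_mle_out2$result
```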