DTW Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:mapPath
blj1506016597986.ditamap
dita:ditavalPath
blj1506016597986.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

This example compares multiple time series to both a common template and each other. Each time series represents stock prices and the template represents a series of stock index prices.

Input

input_table: timeseriesdata
timeseriesid timestamp1 stockprice
1 0 24.2019
1 0.025063 27.8701
1 0.050125 31.4969
1 0.075188 35.083
1 0.100251 38.6286
1 0.125313 42.1343
1 0.150376 45.6005
1 0.175439 49.0276
1 0.200501 52.4162
1 0.225564 55.7666
1 0.250627 59.0792
... ... ...
template_table: templatedata
templateid timestamp2 index_price
1 0 0
1 0.025063 0
1 0.050125 0
1 0.075188 0
1 0.100251 0
1 0.125313 0
1 0.150376 0
1 0.175439 0
1 0.200501 0
1 0.225564 0
1 0.250627 0
... ... ...
mapping_table: mappingdata
timeseriesid templateid
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
4 1
4 2
4 3

SQL Call

SELECT * FROM DTW (
  ON timeseriesdata AS input_table
    PARTITION BY timeseriesid
    ORDER BY timestamp1
  ON templatedata AS template_table DIMENSION
    ORDER BY timestamp2
  ON mappingdata AS mapping_table
    PARTITION BY timeseriesid
  USING
  TargetColumns ('stockprice', 'timestamp1')
  TemplateColumns ('indexprice', 'timestamp2')
  TimeSeriesID ('timeseriesid')
  TemplateID ('templateid')
) AS dt ORDER BY "timeseries_id";

Output

timeseries_id template_id warp_distance
1 1 25163.9
1 2 7547.69
1 3 19577.6
2 1 132.669
2 2 1904.08
2 3 71.7805
3 1 351.676
3 2 3614.2
3 3 75.7767
4 1 4927.61
4 2 914.257s
4 3 16641.6

Plot and Interpretation of Results



The warping distance is an unnormalized measure of how dissimilar two time series are. The warp_distance column in the output table has the warping distance for all pairs in the mapping table; that is, for every timeseries_id and template_id number.

The figure shows that input 2 is more similar to templates 1 and 3 than to template 2. The warp distances also show this:
Template Warp Distance
1 131.588
2 106.131
3 ~540

Because the dissimilarity of two time series is not based on whether they are temporarily close (the time is stretched and the two time series that are offset by a constant time interval are effectively the same), input 3 is not very dissimilar to templates 1 and 3. However, input 4 has the largest warping distance measure from templates 1 and 3, as the curvature of the latter 2 is far from input 4. Time stretching brings input 4 closer to templates 1 and 3, but with a larger warping path (not output above) and therefore, a larger warping distance.