DTW Example | Teradata Vantage - DTW Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

This example compares multiple time series to both a common template and each other. Each time series represents stock prices and the template represents a series of stock index prices.

Input

InputTable: timeseriesdata
timeseriesid timestamp1 stockprice
1 0 24.2019
1 0.025063 27.8701
1 0.050125 31.4969
1 0.075188 35.083
1 0.100251 38.6286
1 0.125313 42.1343
1 0.150376 45.6005
1 0.175439 49.0276
1 0.200501 52.4162
1 0.225564 55.7666
1 0.250627 59.0792
... ... ...
ReferenceTable: templatedata
templateid timestamp2 index_price
1 0 0
1 0.025063 0
1 0.050125 0
1 0.075188 0
1 0.100251 0
1 0.125313 0
1 0.150376 0
1 0.175439 0
1 0.200501 0
1 0.225564 0
1 0.250627 0
... ... ...

SQL Call

SELECT * FROM DTW (
  ON timeseriesdata AS InputTable PARTITION BY timeseriesid ORDER BY timestamp1
  ON templatedata AS ReferenceTable DIMENSION ORDER BY templateid, timestamp2
  USING
  TargetColumns ('stockprice', 'timestamp1')
  TemplateColumns ('indexprice', 'timestamp2')
  TimeSeriesId ('timeseriesid')
  TemplateId ('templateid')
) AS dt ORDER BY 1, 2;

Output

 timeseries_id template_id warp_distance      
 ------------- ----------- ------------------ 
             1           1      25163.9453125
             1           2      7547.69140625
             1           3    19577.634765625
             2           1 132.66847229003906
             2           2  1904.084716796875
             2           3  71.78057861328125
             3           1  351.6783142089844
             3           2   3614.21240234375
             3           3  75.77669525146484
             4           1   4927.62451171875
             4           2  914.2536010742188
             4           3          16641.625

Plot and Interpretation of Results



The warping distance is an unnormalized measure of how dissimilar two time series are. The warp_distance column in the output table has the warping distance for all pairs in the mapping table; that is, for every timeseries_id and template_id number.

The figure shows that input 2 is more similar to templates 1 and 3 than to template 2. The warp distances also show this:
Template Warp Distance
1 131.588
2 106.131
3 ~540

Because the dissimilarity of two time series is not based on whether they are temporarily close (the time is stretched and the two time series that are offset by a constant time interval are effectively the same), input 3 is not very dissimilar to templates 1 and 3. However, input 4 has the largest warping distance measure from templates 1 and 3, as the curvature of the latter 2 is far from input 4. Time stretching brings input 4 closer to templates 1 and 3, but with a larger warping path (not output above) and therefore, a larger warping distance.

Download a zip file of all examples and a SQL script file that creates their input tables.