DTW Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product

Teradata Vantage

Release Number

8.00

1.0

Published

May 2019

Language

English (United States)

Last Update

2019-11-22

dita:id

B700-4003

This example compares multiple time series to both a common template and each other. Each time series represents stock prices and the template represents a series of stock index prices.

Input

input_table: timeseriesdata
timeseriesid	timestamp1	stockprice
1	0	24.2019
1	0.025063	27.8701
1	0.050125	31.4969
1	0.075188	35.083
1	0.100251	38.6286
1	0.125313	42.1343
1	0.150376	45.6005
1	0.175439	49.0276
1	0.200501	52.4162
1	0.225564	55.7666
1	0.250627	59.0792
...	...	...

template_table: templatedata
templateid	timestamp2	indexprice
1	0	0
1	0.025063	0
1	0.050125	0
1	0.075188	0
1	0.100251	0
1	0.125313	0
1	0.150376	0
1	0.175439	0
1	0.200501	0
1	0.225564	0
1	0.250627	0
...	...	...

mapping_table: mappingdata
timeseriesid	templateid
1	1
1	2
1	3
2	1
2	2
2	3
3	1
3	2
3	3
4	1
4	2
4	3

SQL Call

SELECT * FROM DTW (
  ON timeseriesdata AS input_table
    PARTITION BY timeseriesid
    ORDER BY timestamp1
  ON templatedata AS template_table DIMENSION
    ORDER BY timestamp2
  ON mappingdata AS mapping_table
    PARTITION BY timeseriesid
  USING
  TargetColumns ('stockprice', 'timestamp1')
  TemplateColumns ('indexprice', 'timestamp2')
  TimeSeriesID ('timeseriesid')
  TemplateID ('templateid')
) AS dt ORDER BY "timeseries_id", "template_id";

Output

timeseries_id	template_id	warp_distance
1	1	25163.9
1	2	7547.69
1	3	19577.6
2	1	132.669
2	2	1904.08
2	3	71.7805
3	1	351.676
3	2	3614.2
3	3	75.7767
4	1	4927.61
4	2	914.257
4	3	16641.6
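
As a rough illustration of how this output might be consumed, the following Python sketch picks the template with the smallest warp_distance for each time series. The rows are copied from the output table above; the variable names (`rows`, `best`) are hypothetical and not part of the DTW function.

```python
# Rows copied from the DTW output: (timeseries_id, template_id, warp_distance)
rows = [
    (1, 1, 25163.9), (1, 2, 7547.69), (1, 3, 19577.6),
    (2, 1, 132.669), (2, 2, 1904.08), (2, 3, 71.7805),
    (3, 1, 351.676), (3, 2, 3614.2),  (3, 3, 75.7767),
    (4, 1, 4927.61), (4, 2, 914.257), (4, 3, 16641.6),
]

# For each time series, keep the (template_id, warp_distance) pair
# with the smallest warp distance seen so far.
best = {}
for ts_id, tpl_id, dist in rows:
    if ts_id not in best or dist < best[ts_id][1]:
        best[ts_id] = (tpl_id, dist)

for ts_id in sorted(best):
    tpl_id, dist = best[ts_id]
    print(f"time series {ts_id}: closest template {tpl_id} (warp distance {dist})")
```

With these rows, time series 1 and 4 are closest to template 2, and time series 2 and 3 are closest to template 3.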

Plot and Interpretation of Results

The warping distance is an unnormalized measure of how dissimilar two time series are: larger values mean less similar series, and the value is not scaled by series length. The warp_distance column in the output table gives the warping distance for every pair in the mapping table; that is, for every (timeseries_id, template_id) combination.
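
To make "unnormalized" concrete, here is a minimal Python sketch of the textbook dynamic-programming form of dynamic time warping. It is an illustration under stated assumptions, not the implementation behind the DTW function: it compares values only (timestamps are ignored), uses the absolute difference as the point-to-point cost, and the name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimal accumulated cost over all monotone alignments.

    Assumption: point cost is the absolute difference |a[i] - b[j]|.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # a advances, b repeats
                                    cost[i][j - 1],      # b advances, a repeats
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same constant offset of 1 yields a larger distance for longer series,
# because the accumulated cost is not divided by the path length.
short = dtw_distance([0, 0, 0], [1, 1, 1])
long_ = dtw_distance([0] * 6, [1] * 6)
print(short, long_)
```

Because the result is a sum along the warping path, distances are only directly comparable between series of similar length and scale.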

The figure shows that input 2 is more similar to templates 1 and 3 than to template 2. The warp distances also show this:

Template	Warp Distance
1	132.669
2	1904.08
3	71.7805

Because the dissimilarity of two time series does not depend on whether they are temporally close (time is stretched, so two time series that are offset by a constant time interval are effectively the same), input 3 is not very dissimilar to templates 1 and 3. However, input 4 has a much larger warping distance from templates 1 and 3 than inputs 2 and 3 do, because the curvature of those two templates differs greatly from that of input 4. Time stretching brings input 4 closer to templates 1 and 3, but with a longer warping path (not shown in the output above) and therefore a larger warping distance.
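
The effect of time stretching on offset series can be sketched in Python as follows. This is an illustrative example with made-up data, not the DTW function's implementation: `dtw_distance` is a hypothetical helper that uses the absolute difference as the point cost and keeps only two rows of the cost table.

```python
def dtw_distance(a, b):
    """Rolling-row DTW with absolute-difference point cost (an assumption)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row for the empty prefix of a
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


# Made-up data: the same pulse, shifted two steps later in time.
pulse = [0, 0, 1, 2, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 2, 1, 0]

# A point-by-point comparison penalizes the shift ...
lockstep = sum(abs(x - y) for x, y in zip(pulse, shifted))
# ... while DTW stretches the flat segments and aligns the pulses.
warped = dtw_distance(pulse, shifted)

print(lockstep, warped)
```

Here the point-by-point difference is 6, while the warping distance is 0: the shift is absorbed entirely by repeating the flat samples, which is why series that differ only by a time offset are treated as effectively the same.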