ShapeletUnsupervised Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:mapPath
ima1540829771750.ditamap
dita:ditavalPath
jsj1481748799576.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The input table has 10 price observations for four stocks. The column period contains time values, represented by integers. Because the function is unsupervised, it ignores the stock_category column; however, you can use that column to verify that the created clusters belong to the same category.

InputTable: ushapelets_input
stockid period stockprice stock_category
1 22418 460 Technology
1 22419 457 Technology
1 22420 452 Technology
1 22421 459 Technology
1 22422 462 Technology
1 22423 459 Technology
1 22424 463 Technology
1 22425 479 Technology
1 22426 493 Technology
1 22427 490 Technology
2 22418 66.62 Healthcare
2 22419 66.87 Healthcare
2 22420 67 Healthcare
2 22421 67.25 Healthcare
2 22422 65.88 Healthcare
2 22423 66.12 Healthcare
2 22424 66.5 Healthcare
2 22425 67.75 Healthcare
2 22426 67.5 Healthcare
2 22427 67.25 Healthcare
3 22418 66.42 Healthcare
3 22419 66.87 Healthcare
3 22420 66.6 Healthcare
3 22421 67.15 Healthcare
3 22422 65.68 Healthcare
3 22423 65.92 Healthcare
3 22424 66.7 Healthcare
3 22425 67.75 Healthcare
3 22426 68 Healthcare
3 22427 66.95 Healthcare
4 22418 489 Technology
4 22419 487 Technology
4 22420 485 Technology
4 22421 489 Technology
4 22422 490 Technology
4 22423 496 Technology
4 22424 497 Technology
4 22425 499 Technology
4 22426 498 Technology
4 22427 497 Technology

The following figure is a graphic representation of the input data.

ShapeletUnsupervised Example Input Data

In the time period shown, technology stocks 1 and 4 have similar price trajectories, as do healthcare stocks 2 and 3.

SQL Call

SELECT * FROM ShapeletUnsupervised (
	ON ushapelets_input AS InputTable
	OUT TABLE OutputTable (uss_output)
	USING
	IDColumn ('stockid')
	TimeColumn ('period')
	TargetColumn ('stockprice')
	SAXWindowSize (5)
) AS dt;

Output

 stats                                 
 ------------------------------------- 
 Unsupervised shapelets table created.
 number of clusters : 2               
 number of timeseries : 4

The function assigned technology stocks 1 and 4 to cluster 0 and healthcare stocks 2 and 3 to cluster 1.

SELECT * FROM uss_output;
 stockid cluster_label 
 ------- ------------- 
       4             0
       1             0
       2             1
       3             1

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.