Shapelet Functions (ML Engine) - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10
Product Category: Teradata Vantage™

Shapelets are contiguous subsequences of a time series that identify a class with high accuracy. Because shapelets focus on local features of a time series rather than on the whole series, shapelet-based classification can be more accurate and faster than other time-series classification methods. Shapelets also produce interpretable results, providing useful insights into the differences between classes.
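
The idea of a shapelet as a local, class-identifying subsequence can be illustrated with a minimal Python sketch. It is not the ML Engine implementation. It assumes the common formulation in which the distance from a shapelet to a time series is the smallest Euclidean distance between the shapelet and any equal-length window of the series, and a series is assigned to a class by comparing that distance with a threshold. The names subsequence_distance and classify, the class labels, and the threshold value are hypothetical.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length contiguous window of the series (assumed formulation)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet across every window of length m.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        d = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, d)
    return best


def classify(series, shapelet, threshold):
    """Hypothetical two-class rule: a series that contains something close
    to the shapelet is labeled "A", any other series is labeled "B"."""
    return "A" if subsequence_distance(series, shapelet) <= threshold else "B"


# A short spike is the local feature that separates the two series.
spike = [0, 1, 0]
with_spike = [0, 0, 0, 1, 0, 0]
flat = [0, 0, 0, 0, 0, 0]

print(subsequence_distance(with_spike, spike))  # 0.0
print(subsequence_distance(flat, spike))        # 1.0
print(classify(with_spike, spike, threshold=0.5), classify(flat, spike, threshold=0.5))  # A B
```

Because only the best-matching window matters, the position of the spike within the series does not affect the result, which is why a shapelet captures a local feature rather than the overall shape of the series.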

Any classification task in which the order of the observations must be preserved can be characterized as time-series classification. In many real-world use cases, the time series to be classified differ only slightly from each other, and traditional classifiers may be unable to classify such data with high precision.

Common Use Cases

The most common use cases involve long-term trends that are distinguished from each other only by small local pattern changes. Almost any time-series classification problem can be mapped to a shapelet discovery problem. For example:

  • Clickstream analysis
  • Scientific or health applications such as ECG analysis
  • Imaging applications such as gesture recognition or motion analysis
  • Manufacturing applications such as process anomaly detection
  • Financial applications such as stock price analysis

Normalization

Before a shapelet function classifies or clusters a set of time series, it normalizes and SAX-encodes them. Normalization is required because shapelet classification depends on the distance between two time series, and such distances are comparable only when the series are on the same scale. SAX-encoding makes patterns in the data easier to identify and compare. For more information about SAX-encoding, see SAX (ML Engine).
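
The two preprocessing steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the ML Engine code: it assumes z-normalization (zero mean, unit standard deviation) and the textbook form of SAX, in which the normalized series is averaged over equal-width segments and each segment mean is mapped to a letter by using breakpoints that split the standard normal distribution into equally probable regions. The function names, the segment count, and the four-letter alphabet are hypothetical choices.

```python
from statistics import NormalDist, mean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment."""
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def sax_encode(series, segments, alphabet="abcd"):
    """Encode a series as a SAX word (assumed textbook formulation)."""
    k = len(alphabet)
    # Breakpoints split the standard normal curve into k equal-probability regions.
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = []
    for value in paa(z_normalize(series), segments):
        # The letter index is the number of breakpoints below the segment mean.
        index = sum(value > b for b in breakpoints)
        word.append(alphabet[index])
    return "".join(word)


# Two series with the same shape but different scales get the same SAX word.
print(sax_encode([1, 1, 2, 2, 8, 8, 9, 9], segments=4))          # aadd
print(sax_encode([10, 10, 20, 20, 80, 80, 90, 90], segments=4))  # aadd
```

The second call shows why normalization comes first: without it, the two series would be far apart in raw distance even though they follow the same pattern.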

References

The following references explain in detail how shapelets are identified. The ML Engine implementation of shapelets is based on the Fast Shapelets algorithm published by Rakthanmanon and Keogh. The unsupervised shapelet implementation is based on the scalable unsupervised-shapelet (U-shapelet) algorithm published by Ulanova, Begum, and Keogh.

  • L. Ye, E. Keogh. Time Series Shapelets: A New Primitive for Data Mining. KDD 2009.
  • T. Rakthanmanon, E. Keogh. Fast Shapelets: A Scalable Algorithm for Discovering Time Series Shapelets. SIAM 2013.
  • J. Zakaria, A. Mueen, E. Keogh. Clustering Time Series using Unsupervised-Shapelets.
  • L. Ulanova, N. Begum, E. Keogh. Scalable Clustering of Time Series with U-Shapelets.

Shapelet Functions

Function | Description
ShapeletUnsupervised (ML Engine) | Takes a set of time series and assigns them to clusters, based on the shapelets that it finds.
ShapeletSupervised (ML Engine) | Takes a set of classified time series and outputs a model for classifying time series, based on the shapelets that it finds.
ShapeletSupervisedClassifier (ML Engine) | Takes a set of time series and assigns them to classes, based on the model output by ShapeletSupervised.