Shapelet Functions - Teradata Vantage - MLE 8.00

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product name: Teradata Vantage
Release: 8.00
Category: Programming Reference
Document number: B700-4003-098K

Shapelets are contiguous subsequences of a time series that identify a class with high accuracy. Because shapelets focus on local features of a time series, shapelet-based classification can be more accurate and faster than other time-series classification methods. Shapelets also produce interpretable results, providing useful insight into the differences between classes.
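As an illustration of how a shapelet can identify a class, the following minimal Python sketch computes the smallest Euclidean distance between a candidate shapelet and every equal-length window of a time series, then applies a distance threshold. It is not the ML Engine implementation: the names subsequence_distance and contains_shapelet, the threshold value, and the sample data are hypothetical, and the sketch omits the normalization and SAX-encoding steps for brevity.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length contiguous window of `series`."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def contains_shapelet(series, shapelet, threshold):
    """Hypothetical decision rule: the series matches the shapelet's class
    when the best-matching window lies within `threshold` of the shapelet."""
    return subsequence_distance(series, shapelet) <= threshold


# Illustrative data only: a small local bump distinguishes the two series.
shapelet = [0, 1, 0]
with_bump = [0, 0, 0, 1, 0, 0]
flat = [0, 0, 0, 0, 0, 0]

print(contains_shapelet(with_bump, shapelet, threshold=0.5))  # True
print(contains_shapelet(flat, shapelet, threshold=0.5))       # False
```

Because only the best-matching window matters, the location of the bump within the series does not affect the result, which is why shapelets are described as local features.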

Any classification task in which the order of the observations must be preserved can be characterized as time-series classification. In many real-world use cases, the data varies only slightly from one class to another, and traditional classifiers may be unable to classify such data with high precision.

Common Use Cases

The most common use cases involve long-term trends that are distinguished from one another by small local pattern changes. Almost any time-series classification problem can be mapped to a shapelet discovery problem. For example:

  • Clickstream analysis
  • Scientific or health applications such as ECG analysis
  • Imaging applications such as gesture recognition or motion analysis
  • Manufacturing applications such as process anomaly detection
  • Financial applications such as stock price analysis

Normalization

Before a shapelet function classifies or clusters a set of time series, it normalizes and SAX-encodes them. Normalization is required because shapelet classification depends on the distance between two time series, and distances are comparable only when the series are on the same scale. SAX-encoding makes patterns in the data easier to identify and compare. For more information about SAX-encoding, see SAX.
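The two preprocessing steps can be sketched in Python as follows. This is a generic illustration of z-normalization followed by SAX-encoding, not the ML Engine code: the function names, the four-letter alphabet, the segment count, and the sample series are assumptions chosen for the example. The breakpoints are the standard normal quartiles conventionally used for a four-symbol SAX alphabet.

```python
import statistics
from bisect import bisect_right

# Assumed alphabet of four symbols; breakpoints split the standard
# normal distribution into four equal-probability regions.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        # A constant series has no shape; map it to all zeros.
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-size segment."""
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    size = len(series) // segments
    return [statistics.fmean(series[i * size:(i + 1) * size]) for i in range(segments)]


def sax_encode(series, segments):
    """Normalize, reduce with PAA, then map each segment mean to a symbol."""
    reduced = paa(z_normalize(series), segments)
    return "".join(ALPHABET[bisect_right(BREAKPOINTS, value)] for value in reduced)


# Illustrative data only: two rising series on very different scales.
print(sax_encode([1, 2, 3, 4, 5, 6, 7, 8], segments=4))          # abcd
print(sax_encode([10, 20, 30, 40, 50, 60, 70, 80], segments=4))  # abcd
```

The two sample series differ in scale by a factor of ten but produce the same SAX word, which shows why normalization makes distance-based comparison of shapes meaningful.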

References

The following references explain in detail how shapelets are identified. The ML Engine implementation of shapelets is based on the fast shapelet finder algorithm published by Rakthanmanon and Keogh. The unsupervised shapelet implementation is based on the scalable unsupervised-shapelet algorithm published by Ulanova, Begum, and Keogh.

  • L. Ye, E. Keogh. Time Series Shapelets: A New Primitive for Data Mining. KDD 2009.
  • T. Rakthanmanon, E. Keogh. Fast Shapelets: A Scalable Algorithm for Discovering Time Series Shapelets. SIAM 2013.
  • J. Zakaria, A. Mueen, E. Keogh. Clustering Time Series using Unsupervised-Shapelets.
  • L. Ulanova, N. Begum, E. Keogh. Scalable Clustering of Time Series with U-Shapelets.

Shapelet Functions

Function                      Description
ShapeletUnsupervised          Takes a set of time series and assigns them to clusters, based on the shapelets that it finds.
ShapeletSupervised            Takes a set of classified time series and outputs a model for classifying time series, based on the shapelets that it finds.
ShapeletSupervisedClassifier  Takes a set of time series and assigns them to classes, based on the model output by ShapeletSupervised.