Any classification task in which the order of the observations must be preserved can be treated as time-series classification. In many real-world use cases, the series in different classes differ only slightly, and traditional classifiers may be unable to classify such data with high precision.
Shapelets are contiguous subsequences of a time series that identify a class with high accuracy. Because shapelets focus on local features of a time series, they can be more accurate and faster than other time-series classification methods. Shapelets also produce interpretable results, providing useful insights into the differences between classes.
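To make the idea of a local feature concrete, the following is a minimal sketch of how a single shapelet could be matched against a time series. It assumes the commonly used subsequence distance (the smallest Euclidean distance between the shapelet and any window of the same length) and a simple distance threshold for the class decision. The names `subsequence_distance`, `contains_shapelet`, and `threshold` are hypothetical and are not part of any Aster Analytics function.

```python
import numpy as np

def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    contiguous window of the same length in the series."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = float("inf")
    # Slide the shapelet across every possible start position.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = float(np.sqrt(np.sum((window - shapelet) ** 2)))
        best = min(best, dist)
    return best

def contains_shapelet(series, shapelet, threshold):
    """Hypothetical decision rule: the series matches the class if the
    shapelet occurs somewhere within the given distance threshold."""
    return subsequence_distance(series, shapelet) <= threshold

# Illustrative data only: one series with a local bump, one flat series.
bump = [0, 0, 1, 3, 1, 0, 0]
flat = [0, 0, 0, 0, 0, 0, 0]
shapelet = [1, 3, 1]
print(contains_shapelet(bump, shapelet, threshold=1.0))
print(contains_shapelet(flat, shapelet, threshold=1.0))
```

Because only the best-matching window matters, the position of the pattern within the series does not affect the result, which is what allows a shapelet to capture a local feature.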
The most common use cases involve long-term trends that are distinguished from one another by small local pattern changes. Almost any time-series classification problem can be mapped to a shapelet discovery problem. For example:
- Clickstream analysis
- Scientific or health applications such as ECG analysis
- Imaging applications such as gesture recognition or motion analysis
- Manufacturing applications such as process anomaly detection
- Financial applications such as stock price analysis
Before a shapelets function classifies or clusters a set of time series, it normalizes and SAX-encodes them. Normalization is required because shapelet classification depends on the distance between two time series. SAX-encoding makes patterns in the data easier to identify and compare. For more information about SAX-encoding, see SAX2.
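The preprocessing described above can be sketched as follows. This is an illustration of generic SAX encoding, not the SAX2 function itself: it assumes z-normalization (zero mean, unit standard deviation), piecewise aggregate approximation over equal-sized segments, and breakpoints that split the standard normal distribution into equally probable regions. The function names, the segment count, and the four-letter alphabet are assumptions chosen for the example.

```python
import numpy as np
from statistics import NormalDist

def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    series = np.asarray(series, dtype=float)
    std = series.std()
    if std == 0:
        # A constant series has no shape; map it to all zeros.
        return np.zeros_like(series)
    return (series - series.mean()) / std

def sax_encode(series, n_segments, alphabet="abcd"):
    """Generic SAX: z-normalize, average over segments, map to letters."""
    z = z_normalize(series)
    if not 1 <= n_segments <= len(z):
        raise ValueError("n_segments must be between 1 and the series length")
    # Piecewise aggregate approximation: the mean of each segment.
    paa = np.array([seg.mean() for seg in np.array_split(z, n_segments)])
    # Breakpoints that divide the standard normal into equiprobable bins.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    # Each segment mean gets the letter of the bin it falls into.
    indices = np.searchsorted(breakpoints, paa)
    return "".join(alphabet[i] for i in indices)

# Two series with the same shape but different scales.
print(sax_encode([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))
print(sax_encode([10, 20, 30, 40, 50, 60, 70, 80], n_segments=4))
```

Both calls produce the same word, because normalization removes differences in offset and scale before the distance-based comparison takes place.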
The following references explain in detail how shapelets are identified. Aster Analytics’ implementation of shapelets is based on the fast shapelet finder algorithm published by Rakthanmanon and Keogh. The unsupervised shapelet implementation is based on the scalable unsupervised-shapelet algorithm published by Ulanova, Begum, and Keogh.
- L. Ye, E. Keogh. Time Series Shapelets: A New Primitive for Data Mining, KDD 2009.
- T. Rakthanmanon, E. Keogh. Fast Shapelets: A scalable algorithm for discovering time series shapelets, SIAM 2013.
- J. Zakaria, A. Mueen, E. Keogh. Clustering Time Series using Unsupervised-Shapelets.
- L. Ulanova, N. Begum, E. Keogh. Scalable Clustering of Time Series with U-Shapelets.