More Web Proxy on the site http://driver.im/

research-article

Query-sensitive distance measure selection for time series nearest neighbor classification

Authors: Alexios Kotsifakos, Vassilis Athitsos, Panagiotis PapapetrouAuthors Info & Claims

Intelligent Data Analysis, Volume 20, Issue 1

Pages 5 - 27

https://doi.org/10.3233/IDA-150791

Published: 18 January 2016 Publication History

Abstract

Many distance or similarity measures have been proposed for time series similarity search. However, none of these measures is guaranteed to be optimal when used for 1-Nearest Neighbor (NN) classification. In this paper we study the problem of selecting the most appropriate distance measure, given a pool of time series distance measures and a query, so as to perform NN classification of the query. We propose a framework for solving this problem, by identifying, given the query, the distance measure most likely to produce the correct classification result for that query. From this proposed framework, we derive three specific methods, that differ from each other in the way they estimate the probability that a distance measure correctly classifies a query object. In our experiments, our pool of measures consists of Dynamic Time Warping (DTW), Move-Split-Merge (MSM), and Edit distance with Real Penalty (ERP). Based on experimental evaluation with 45 datasets, the best-performing of the three proposed methods provides the best results in terms of classification error rate, compared to the competitors, which include using the Cross Validation method for selecting the distance measure in each dataset, as well as using a single specific distance measure (DTW, MSM, or ERP) across all datasets.

References

[1]

Agrawal R., Faloutsos C. and Swami A., Efficient similarity search in sequence databases, in: In Proc of FODO, Springer Verlag (1993), 69-84.

[2]

Athitsos V., Hadjieleftheriou M., Kollios G. and Sclaroff S., Query-sensitive embeddings, ACM Trans Database Syst 32(2) (2007).

[3]

Athitsos V., Papapetrou P., Potamias M., Kollios G. and Gunopulos D., Approximate embedding-based subsequence matching of time series, in: SIGMOD (2008), 365-378.

[4]

Bellman R., The theory of dynamic programming, Bull Amer Math Soc 60(6) (1954), 503-515.

[5]

Bergroth L., Hakonen H. and Raita T., A survey of longest common subsequence algorithms, in: Proceedings of the Seventh International Symposium on String Processing Information Retrieval (SPIRE'00), SPIRE '00, (2000), 39.

[6]

Bollobás B., Das G., Gunopulos D. and Mannila H., Time-series similarity problems and well-separated geometric sets, in: Symposium on Computational Geometry (1997), 454-456.

[7]

Chen L. and Ng R., On the marriage of l_p-norms and edit distance, in: VLDB (2004), 792-803.

[8]

Chen L. and Özsu M.T., Robust and fast similarity search for moving object trajectories, in: SIGMOD (2005), 491-502.

[9]

Chen Y., Nascimento M.A., Chin B., Anthony O. and Tung K.H., Spade: On shape-based pattern detection in streaming time series, in: Proceedings of the IEEE International Conference of Data Engineering (ICDE), (2007), 786-795.

[10]

Cox D., Principles of Statistical Inference, Cambridge University Press, 2006.

[11]

Domeniconi C., Peng J. and Gunopulos D., Locally adaptive metric nearest-neighbor classification, Pattern Analysis and Machine Intelligence, IEEE Transactions on 24(9) (2002), 1281-1285.

Digital Library

[12]

Hastie T. and Tibshirani R., Discriminant adaptive nearest neighbor classification, Pattern Analysis and Machine Intelligence, IEEE Transactions on 18(6) (1996), 607-616.

Digital Library

[13]

Hu B., Chen Y. and Keogh E., Time series classification under more realistic assumptions, in: SDM (2012), 578-586.

[14]

Keogh E., Exact indexing of dynamic time warping, in: VLDB (2002), 406-417.

[15]

Keogh E., Zhu Q., Hu B., Hao Y., Xi X., Wei L. and Ratanamahatana C., The UCR time series classification/clustering homepage: www.cs.ucr.edu/eamonn/time_series_data/, 2011.

[16]

Kotsifakos A., Papapetrou P., Hollmén J. and Gunopulos D., A subsequence matching with gaps-range-tolerances framework: A query-by-humming application, PVLDB 4(11) (2011), 761-771.

Digital Library

[17]

Kruskall J.B. and Liberman M., The symmetric time warping algorithm: From continuous to discrete, in: Time Warps, Addison-Wesley (1983).

[18]

Levenshtein V.I., Binary codes capable of correcting deletions, insertions, and reversals, Soviet Physics 10(8) (1966), 707-710.

[19]

Lin J., Keogh E., Lonardi S. and Chiu B., A symbolic representation of time series, with implications for streaming algorithms, in: Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, DMKD '03, (2003), 2-11.

[20]

Lin J., Khade R. and Li Y., Rotation-invariant similarity in time series using bag-of-patterns representation, J Intell Inf Syst 39(2) (2012), 287-315.

Digital Library

[21]

Lines J., Davis L.M., Hills J. and Bagnall A., A shapelet transform for time series classification, in: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '12, (2012), 289-297.

[22]

Marteau P.-F., Time warp edit distance with stiffness adjustment for time series matching, Pattern Analysis and Machine Intelligence, IEEE Transactions on 31(2) (2009), 306-318.

Digital Library

[23]

Mueen A., Keogh E. and Young N., Logical-shapelets: An expressive primitive for time series classification, in: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, (2011), 1154-1162.

[24]

Sakoe H. and Chiba S., Dynamic programming algorithm optimization for spoken word recognition, Transactions on ASSP 26 (1978), 43-49.

[25]

Stefan A., Athitsos V. and Das G., The move-split-merge metric for time series, Transactions on Knowledge and Data Engineering, (2012).

[26]

Unal E., Chew E., Georgiou P. and Narayanan S., Challenging uncertainty in query by humming systems: A fingerprinting approach, Transactions on Audio Speech and Language Processing 16(2) (2008), 359-371.

Digital Library

[27]

Wang X., Mueen A., Ding H., Trajcevski G., Scheuermann P. and Keogh E.J., Experimental comparison of representation methods and distance measures for time series data, Data Minining and Knowledge Discovery 26(2) (2013), 275-309.

Digital Library

[28]

Xing Z., Pei J., Yu P.S. and Wang K., Extracting interpretable features for early classification on time series, in: SDM (2011), 247-258.

[29]

Ye L. and Keogh E., Time series shapelets: A novel technique that allows accurate, interpretable and fast classification, Data Min Knowl Discov 22(1-2) (2011), 149-182.

Digital Library

Cited By

Meng QPu P(2018)RLSIntelligent Data Analysis10.3233/IDA-17371722:4(767-785)Online publication date: 1-Jan-2018
https://dl.acm.org/doi/10.3233/IDA-173717

Index Terms

Query-sensitive distance measure selection for time series nearest neighbor classification
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
2. Information systems
  1. Information retrieval
  2. Information systems applications
    1. Data mining

Index terms have been assigned to the content through auto-classification.

Recommendations

Fast time series classification under lucky time warping distance
SAC '14: Proceedings of the 29th Annual ACM Symposium on Applied Computing

In time series mining, the Dynamic Time Warping (DTW) distance is a commonly and widely used similarity measure. Since the computational complexity of the DTW distance is quadratic, various kinds of warping constraints, lower bounds and abstractions ...
Nearest neighbour distance matrix classification
ADMA'10: Proceedings of the 6th international conference on Advanced data mining and applications: Part I

A distance based classification is one of the popular methods for classifying instances using a point-to-point distance based on the nearest neighbour or k-NEAREST NEIGHBOUR (k-NN). The representation of distance measure can be one of the various ...
Optimized distance metrics for differential evolution based nearest prototype classifier

In this article, we introduce a differential evolution based classifier with extension for selecting automatically the applied distance measure from a predefined pool of alternative distances measures to suit optimally for classifying the particular ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Intelligent Data Analysis

Intelligent Data Analysis Volume 20, Issue 1

Jan 2016

211 pages

ISSN:1088-467X

Issue’s Table of Contents

IOS Press and the authors. All rights reserved.

Publisher

IOS Press

Netherlands

Publication History

Published: 18 January 2016

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 16 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Meng QPu P(2018)RLSIntelligent Data Analysis10.3233/IDA-17371722:4(767-785)Online publication date: 1-Jan-2018
https://dl.acm.org/doi/10.3233/IDA-173717

View Options

View options

Media

Figures

Other

Tables

View Issue’s Table of Contents