More Web Proxy on the site http://driver.im/

Article

An efficient and accurate method for evaluating time series similarity

Authors:

Michael D. Morse,

Jignesh M. PatelAuthors Info & Claims

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data

Pages 569 - 580

https://doi.org/10.1145/1247480.1247544

Published: 11 June 2007 Publication History

Abstract

A variety of techniques currently exist for measuring the similarity between time series datasets. Of these techniques, the methods whose matching criteria is bounded by a specified ε threshold value, such as the LCSS and the EDR techniques, have been shown to be robust in the presence of noise, time shifts, and data scaling. Our work proposes a new algorithm, called the Fast Time Series Evaluation (FTSE) method, which can be used to evaluate such threshold value techniques, including LCSS and EDR. Using FTSE, we show that these techniques can be evaluated faster than using either traditional dynamic programming or even warp-restricting methods such as the Sakoe-Chiba band and the Itakura Parallelogram.

We also show that FTSE can be used in a framework that can evaluate a richer range of ε threshold-based scoring techniques, of which EDR and LCSS are just two examples. This framework, called Swale, extends the ε threshold-based scoring techniques to include arbitrary match rewards and gap penalties. Through extensive empirical evaluation, we show that Swale can obtain greater accuracy than existing methods.

References

[1]

R. Agarwal, C. Faloutsos, and A. R. Swami. Efficient Similarity Search in Sequence Databases. In FODO, pages 69--84, 1993.

[2]

D. Berndt and J. Clifford. Using Dynamic Time Warping to Find Patterns in Time Series. In AAAI-94 Workshop on Knowledge Discovery in Databases, pages 359--370, 1994.

[3]

K. Chan and A. C. Fu. Efficient Time Series Matching by Wavelets. In ICDE, pages 126--133, 1999.

Digital Library

[4]

L. Chen and R. Ng. On the Marriage of Lp-norms and Edit Distance. In VLDB, pages 792--803, 2004.

Digital Library

[5]

L. Chen, M. T. Özsu, and V. Oria. Robust and Fast Similarity Search for Moving Object Trajectories. In SIGMOD, pages 491--502, 2005.

Digital Library

[6]

R. Cilibrasi and P. M. B. Vitanyi. Clustering by Compression. IEEE Transactions on Information Theory, 51(4):1523--1545, 2005.

Digital Library

[7]

C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In SIGMOD, pages 419--429, 1994.

Digital Library

[8]

J. Gips, M. Betke, and P. Fleming. The Camera Mouse: Preliminary Investigation of Automated Visual Tracking for Computer Access. In Proc. of Rehabilitation Engineering and Assistive Technology Society of North America(RESNA), pages 98--100, 2000.

[9]

D. Goldin and P. Kanellakis. On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. In Constraint Programming, pages 137--153, 1995.

Digital Library

[10]

J. W. Hunt and T. G. Szymanski. A Fast Algorithm for Computing Longest Common Subsequences. CACM, pages 350--353, 1977.

Digital Library

[11]

F. Itakura. Minimum Prediction Residual Principle Applied to Speech Recognition. IEEE Trans. Acoustics, Speech, and Signal Proc., Vol. ASSP-23:52--72, 1975.

[12]

A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice-Hall, 1988.

Digital Library

[13]

E. Keogh. Exact Indexing of Dynamic Time Warping. In VLDB, pages 406--417, 2002.

Digital Library

[14]

E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra. Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. KAIS, 3(3):263--286, 2000.

[15]

E. Keogh and S. Kasetty. The Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In SIGKDD, pages 102--111, 2002.

Digital Library

[16]

E. Keogh, S. Lonardi, and C. A. Ratanamahatana. Towards Parameter-Free Data Mining. In SIGKDD, pages 206--215, 2004.

Digital Library

[17]

E. Keogh and M. Pazzani. Scaling Up Dynamic Time Warping to Massive Datasets. In PKDD, pages 1--11, 1999.

Digital Library

[18]

E. J. Keogh, K. Chakrabarti, S. Mehrotra, and M. J. Pazzani. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In SIGMOD, pages 151--162, 2001.

Digital Library

[19]

S. W. Kim, S. Park, and W. W. Chu. An Index-Based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases. In ICDE, pages 607--614, 2001.

Digital Library

[20]

F. Korn, H. Jagadish, and C. Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In SIGMOD, pages 289--300, 1997.

Digital Library

[21]

S. Kuo and G. R. Cross. An Improved Algorithm to Find the Length of the Longest Common Subsequence of Two Strings. ACM SIGIR Forum, 23(3-4):89--99, 1989.

Digital Library

[22]

Climate variability and predictability website. http://www.clivar.org.

[23]

I. Popivanov and R. Miller. Similarity Search Over Time Series Data Using Wavelets. In ICDE, page 212, 2001.

Digital Library

[24]

C. Ratanamahatana and E. Keogh. Making Time-series Classification More Accurate Using Learned Constraints. In SIAM International Conference on Data Mining, 2004.

[25]

S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, 1995.

Digital Library

[26]

H. Sakoe and S. Chiba. Dynamic Programming Algorithm Optimization for Spoken Word Recognition. IEEE Trans. Acoustics, Speech, and Signal Proc.,Vol. ASSP-26(1):43--49, 1978.

[27]

Y. Sakurai, M. Yoshikawa, and C. Faloutsos. FTW: Fast Similarity Search under the Time Warping Distance. In PODS, pages 326--337, 2005.

Digital Library

[28]

S. Hettich and S. D. Bay. The UCI KDD Archive http://kdd.ics.uci.edu.

[29]

M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. Keogh. Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures. In SIGKDD, pages 216--225, 2003.

Digital Library

[30]

M. Vlachos, G. Kollios, and D. Gunopulos. Discovering Similar Multidimensional Trajectories. In ICDE, pages 673--684, 2002.

Digital Library

[31]

J. Yang and W. Wang. CLUSEQ: Efficient and Effective Sequence Clustering. In ICDE, pages 101--112, 2003.

[32]

B. K. Yi and C. Faloutsos. Fast Time Sequence Indexing for Arbitrary Lp Norms. In VLDB, pages 385--394, 2000.

Digital Library

[33]

B. K. Yi, H. V. Jagadish, and C. Faloutsos. Efficient Retrieval of Similar Time Sequences Under Time Warping. In ICDE, pages 201--208, 1998.

Digital Library

[34]

Y. Zhu and D. Shasha. Warping Indexes with Envelope Transforms for Query by Humming. In SIGMOD, pages 181--192, 2003.

Digital Library

Cited By

Martínez García EAlberti MArcos Álvarez A(2025)Generation of Penetrometric Profile of the Soil Applying Machine Learning to Measure While Drilling Data from Deep Foundation MachineryApplied Sciences10.3390/app1503133115:3(1331)Online publication date: 27-Jan-2025
https://doi.org/10.3390/app15031331
Mohammed JBöhlen MHelmer SBaeza-Yates RBonchi F(2024)Quantifying and Estimating the Predictability Upper Bound of Univariate Numeric Time SeriesProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671995(2236-2247)Online publication date: 25-Aug-2024
https://dl.acm.org/doi/10.1145/3637528.3671995
Atanasovska AZdraveski VSpasov D(2024)Parallelizing Dynamic Time Warping Algorithm for Hotel Occupancy Time Series Similarity Measurement2024 47th MIPRO ICT and Electronics Convention (MIPRO)10.1109/MIPRO60963.2024.10569678(1974-1978)Online publication date: 20-May-2024
https://doi.org/10.1109/MIPRO60963.2024.10569678
Show More Cited By

Index Terms

An efficient and accurate method for evaluating time series similarity
1. Human-centered computing
  1. Visualization
    1. Visualization application domains
      1. Geographic visualization
2. Information systems
  1. Information systems applications
    1. Data mining
    2. Spatial-temporal systems

Recommendations

An empirical evaluation of similarity measures for time series classification

Time series are ubiquitous, and a measure to assess their similarity is a core part of many computational systems. In particular, the similarity measure is the most essential ingredient of time series clustering and classification systems. Because of ...
Time series similarity measure based on the function of degree of disagreement
KSEM'11: Proceedings of the 5th international conference on Knowledge Science, Engineering and Management

Similarity measure is a basic task in time series data mining and attracts much attention in the last decade. This paper considers time series similarity measure from an information theoretic perspective. Based on the function of degree of disagreement (...
Similarity Measure of Multivariate Time Series Based on Segmentation
ICMLC '20: Proceedings of the 2020 12th International Conference on Machine Learning and Computing

Similarity measure of time series is a fundamental problem in data mining tasks. However, most of the similarity methods are mainly for univariate time series, rather than multivariate time series. Among the existing approaches for multivariate time ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data

June 2007

1210 pages

ISBN:9781595936868

DOI:10.1145/1247480

General Chairs:
Lizhu Zhou
Tsinghua University, China
,
Tok Wang Ling
National University of Singapore, Singapore
,
Program Chair:
Beng Chin Ooi
National University of Singapore, Singapore

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2007

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Article

Conference

SIGMOD/PODS07

Sponsor:

SIGMOD/PODS07: International Conference on Management of Data

June 11 - 14, 2007

Beijing, China

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

126
Total Citations
View Citations
683
Total Downloads

Downloads (Last 12 months)116
Downloads (Last 6 weeks)10

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Martínez García EAlberti MArcos Álvarez A(2025)Generation of Penetrometric Profile of the Soil Applying Machine Learning to Measure While Drilling Data from Deep Foundation MachineryApplied Sciences10.3390/app1503133115:3(1331)Online publication date: 27-Jan-2025
https://doi.org/10.3390/app15031331
Mohammed JBöhlen MHelmer SBaeza-Yates RBonchi F(2024)Quantifying and Estimating the Predictability Upper Bound of Univariate Numeric Time SeriesProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671995(2236-2247)Online publication date: 25-Aug-2024
https://dl.acm.org/doi/10.1145/3637528.3671995
Atanasovska AZdraveski VSpasov D(2024)Parallelizing Dynamic Time Warping Algorithm for Hotel Occupancy Time Series Similarity Measurement2024 47th MIPRO ICT and Electronics Convention (MIPRO)10.1109/MIPRO60963.2024.10569678(1974-1978)Online publication date: 20-May-2024
https://doi.org/10.1109/MIPRO60963.2024.10569678
Lee JKim DSim S(2024)Temporal Multi-Features Representation Learning-Based Clustering for Time-Series DataIEEE Access10.1109/ACCESS.2024.341734812(87675-87690)Online publication date: 2024
https://doi.org/10.1109/ACCESS.2024.3417348
Braig MZeiler P(2024)A Study on the Effectiveness of Time Series Similarity Measures for the Comparison of Degradation Curves of Similar Engineering SystemsIEEE Access10.1109/ACCESS.2024.338469712(49602-49623)Online publication date: 2024
https://doi.org/10.1109/ACCESS.2024.3384697
Naser ANaser M(2024)SPINEX-TimeSeries: Similarity-based predictions with explainable neighbors exploration for time series and forecasting problemsComputers & Industrial Engineering10.1016/j.cie.2024.110812(110812)Online publication date: Dec-2024
https://doi.org/10.1016/j.cie.2024.110812
Tanimu BHamed MBello AAbdullahi SAjibike MShahid S(2024)Selecting the optimal gridded climate dataset for Nigeria using advanced time series similarity algorithmsEnvironmental Science and Pollution Research10.1007/s11356-024-32128-0Online publication date: 3-Feb-2024
https://doi.org/10.1007/s11356-024-32128-0
Paparrizos JWu KElmore AFaloutsos CFranklin M(2023)Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding DistancesProceedings of the VLDB Endowment10.14778/3594512.359453016:8(2019-2032)Online publication date: 22-Jun-2023
https://dl.acm.org/doi/10.14778/3594512.3594530
Chakraborty RHolm JPedersen TPopovski P(2023)Finding Representative Sampling Subsets in Sensor Graphs Using Time-series SimilaritiesACM Transactions on Sensor Networks10.1145/359518119:4(1-32)Online publication date: 9-Jun-2023
https://dl.acm.org/doi/10.1145/3595181
Novokhrestova DKostyuchenko EBorovskoy A(2023)Classifier-Based Combined Measure of Syllable Pronunciation Similarity in Speech RehabilitationProceedings of the Seventh International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’23)10.1007/978-3-031-43792-2_13(127-136)Online publication date: 18-Sep-2023
https://doi.org/10.1007/978-3-031-43792-2_13
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten