[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1247480.1247544acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

An efficient and accurate method for evaluating time series similarity

Published: 11 June 2007 Publication History

Abstract

A variety of techniques currently exist for measuring the similarity between time series datasets. Of these techniques, the methods whose matching criteria is bounded by a specified ε threshold value, such as the LCSS and the EDR techniques, have been shown to be robust in the presence of noise, time shifts, and data scaling. Our work proposes a new algorithm, called the Fast Time Series Evaluation (FTSE) method, which can be used to evaluate such threshold value techniques, including LCSS and EDR. Using FTSE, we show that these techniques can be evaluated faster than using either traditional dynamic programming or even warp-restricting methods such as the Sakoe-Chiba band and the Itakura Parallelogram.
We also show that FTSE can be used in a framework that can evaluate a richer range of ε threshold-based scoring techniques, of which EDR and LCSS are just two examples. This framework, called Swale, extends the ε threshold-based scoring techniques to include arbitrary match rewards and gap penalties. Through extensive empirical evaluation, we show that Swale can obtain greater accuracy than existing methods.

References

[1]
R. Agarwal, C. Faloutsos, and A. R. Swami. Efficient Similarity Search in Sequence Databases. In FODO, pages 69--84, 1993.
[2]
D. Berndt and J. Clifford. Using Dynamic Time Warping to Find Patterns in Time Series. In AAAI-94 Workshop on Knowledge Discovery in Databases, pages 359--370, 1994.
[3]
K. Chan and A. C. Fu. Efficient Time Series Matching by Wavelets. In ICDE, pages 126--133, 1999.
[4]
L. Chen and R. Ng. On the Marriage of Lp-norms and Edit Distance. In VLDB, pages 792--803, 2004.
[5]
L. Chen, M. T. Özsu, and V. Oria. Robust and Fast Similarity Search for Moving Object Trajectories. In SIGMOD, pages 491--502, 2005.
[6]
R. Cilibrasi and P. M. B. Vitanyi. Clustering by Compression. IEEE Transactions on Information Theory, 51(4):1523--1545, 2005.
[7]
C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast Subsequence Matching in Time-Series Databases. In SIGMOD, pages 419--429, 1994.
[8]
J. Gips, M. Betke, and P. Fleming. The Camera Mouse: Preliminary Investigation of Automated Visual Tracking for Computer Access. In Proc. of Rehabilitation Engineering and Assistive Technology Society of North America(RESNA), pages 98--100, 2000.
[9]
D. Goldin and P. Kanellakis. On Similarity Queries for Time-Series Data: Constraint Specification and Implementation. In Constraint Programming, pages 137--153, 1995.
[10]
J. W. Hunt and T. G. Szymanski. A Fast Algorithm for Computing Longest Common Subsequences. CACM, pages 350--353, 1977.
[11]
F. Itakura. Minimum Prediction Residual Principle Applied to Speech Recognition. IEEE Trans. Acoustics, Speech, and Signal Proc., Vol. ASSP-23:52--72, 1975.
[12]
A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice-Hall, 1988.
[13]
E. Keogh. Exact Indexing of Dynamic Time Warping. In VLDB, pages 406--417, 2002.
[14]
E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra. Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases. KAIS, 3(3):263--286, 2000.
[15]
E. Keogh and S. Kasetty. The Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In SIGKDD, pages 102--111, 2002.
[16]
E. Keogh, S. Lonardi, and C. A. Ratanamahatana. Towards Parameter-Free Data Mining. In SIGKDD, pages 206--215, 2004.
[17]
E. Keogh and M. Pazzani. Scaling Up Dynamic Time Warping to Massive Datasets. In PKDD, pages 1--11, 1999.
[18]
E. J. Keogh, K. Chakrabarti, S. Mehrotra, and M. J. Pazzani. Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In SIGMOD, pages 151--162, 2001.
[19]
S. W. Kim, S. Park, and W. W. Chu. An Index-Based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases. In ICDE, pages 607--614, 2001.
[20]
F. Korn, H. Jagadish, and C. Faloutsos. Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences. In SIGMOD, pages 289--300, 1997.
[21]
S. Kuo and G. R. Cross. An Improved Algorithm to Find the Length of the Longest Common Subsequence of Two Strings. ACM SIGIR Forum, 23(3-4):89--99, 1989.
[22]
Climate variability and predictability website. http://www.clivar.org.
[23]
I. Popivanov and R. Miller. Similarity Search Over Time Series Data Using Wavelets. In ICDE, page 212, 2001.
[24]
C. Ratanamahatana and E. Keogh. Making Time-series Classification More Accurate Using Learned Constraints. In SIAM International Conference on Data Mining, 2004.
[25]
S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice-Hall, 1995.
[26]
H. Sakoe and S. Chiba. Dynamic Programming Algorithm Optimization for Spoken Word Recognition. IEEE Trans. Acoustics, Speech, and Signal Proc.,Vol. ASSP-26(1):43--49, 1978.
[27]
Y. Sakurai, M. Yoshikawa, and C. Faloutsos. FTW: Fast Similarity Search under the Time Warping Distance. In PODS, pages 326--337, 2005.
[28]
S. Hettich and S. D. Bay. The UCI KDD Archive http://kdd.ics.uci.edu.
[29]
M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, and E. Keogh. Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures. In SIGKDD, pages 216--225, 2003.
[30]
M. Vlachos, G. Kollios, and D. Gunopulos. Discovering Similar Multidimensional Trajectories. In ICDE, pages 673--684, 2002.
[31]
J. Yang and W. Wang. CLUSEQ: Efficient and Effective Sequence Clustering. In ICDE, pages 101--112, 2003.
[32]
B. K. Yi and C. Faloutsos. Fast Time Sequence Indexing for Arbitrary Lp Norms. In VLDB, pages 385--394, 2000.
[33]
B. K. Yi, H. V. Jagadish, and C. Faloutsos. Efficient Retrieval of Similar Time Sequences Under Time Warping. In ICDE, pages 201--208, 1998.
[34]
Y. Zhu and D. Shasha. Warping Indexes with Envelope Transforms for Query by Humming. In SIGMOD, pages 181--192, 2003.

Cited By

View all
  • (2024)Quantifying and Estimating the Predictability Upper Bound of Univariate Numeric Time SeriesProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671995(2236-2247)Online publication date: 25-Aug-2024
  • (2024)Parallelizing Dynamic Time Warping Algorithm for Hotel Occupancy Time Series Similarity Measurement2024 47th MIPRO ICT and Electronics Convention (MIPRO)10.1109/MIPRO60963.2024.10569678(1974-1978)Online publication date: 20-May-2024
  • (2024)Temporal Multi-Features Representation Learning-Based Clustering for Time-Series DataIEEE Access10.1109/ACCESS.2024.341734812(87675-87690)Online publication date: 2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
June 2007
1210 pages
ISBN:9781595936868
DOI:10.1145/1247480
  • General Chairs:
  • Lizhu Zhou,
  • Tok Wang Ling,
  • Program Chair:
  • Beng Chin Ooi
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2007

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. clustering
  2. time series
  3. trajectory similarity

Qualifiers

  • Article

Conference

SIGMOD/PODS07
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)119
  • Downloads (Last 6 weeks)10
Reflects downloads up to 01 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Quantifying and Estimating the Predictability Upper Bound of Univariate Numeric Time SeriesProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3637528.3671995(2236-2247)Online publication date: 25-Aug-2024
  • (2024)Parallelizing Dynamic Time Warping Algorithm for Hotel Occupancy Time Series Similarity Measurement2024 47th MIPRO ICT and Electronics Convention (MIPRO)10.1109/MIPRO60963.2024.10569678(1974-1978)Online publication date: 20-May-2024
  • (2024)Temporal Multi-Features Representation Learning-Based Clustering for Time-Series DataIEEE Access10.1109/ACCESS.2024.341734812(87675-87690)Online publication date: 2024
  • (2024)A Study on the Effectiveness of Time Series Similarity Measures for the Comparison of Degradation Curves of Similar Engineering SystemsIEEE Access10.1109/ACCESS.2024.338469712(49602-49623)Online publication date: 2024
  • (2024)SPINEX-TimeSeries: Similarity-based predictions with explainable neighbors exploration for time series and forecasting problemsComputers & Industrial Engineering10.1016/j.cie.2024.110812(110812)Online publication date: Dec-2024
  • (2024)Selecting the optimal gridded climate dataset for Nigeria using advanced time series similarity algorithmsEnvironmental Science and Pollution Research10.1007/s11356-024-32128-0Online publication date: 3-Feb-2024
  • (2023)Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding DistancesProceedings of the VLDB Endowment10.14778/3594512.359453016:8(2019-2032)Online publication date: 22-Jun-2023
  • (2023)Finding Representative Sampling Subsets in Sensor Graphs Using Time-series SimilaritiesACM Transactions on Sensor Networks10.1145/359518119:4(1-32)Online publication date: 9-Jun-2023
  • (2023)Classifier-Based Combined Measure of Syllable Pronunciation Similarity in Speech RehabilitationProceedings of the Seventh International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’23)10.1007/978-3-031-43792-2_13(127-136)Online publication date: 18-Sep-2023
  • (2022)Scalable Time Series Compound InfrastructureProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3517888(1685-1698)Online publication date: 10-Jun-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media