DOI: 10.1145/1989323.1989364
research-article

Finding semantics in time series

Published: 12 June 2011

Abstract

To understand a complex system, we analyze its output or its log data. For example, we track a system's resource consumption (CPU, memory, message queues of different types, etc.) to help avert system failures; we examine economic indicators to assess the severity of a recession; we monitor a patient's heart rate or EEG for disease diagnosis. Time series data is involved in many such applications. Much work has been devoted to pattern discovery from time series data, but little of it has attempted to use the time series data to unveil a system's internal dynamics. In this paper, we go beyond learning patterns from time series data. We focus on obtaining a better understanding of the mechanism that generates the data, and we regard patterns and their temporal relations as organic components of that hidden mechanism. Specifically, we propose to model time series data using a novel pattern-based hidden Markov model (pHMM), which aims at revealing a global picture of the system that generates the time series data. We propose an iterative approach to refine pHMMs learned from the data. In each iteration, we use the current pHMM to guide time series segmentation and clustering, which in turn enables us to learn a more accurate pHMM. Furthermore, we propose three pruning strategies to speed up the refinement process. Empirical results on real datasets demonstrate the feasibility and effectiveness of the proposed approach.
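
The abstract describes a pipeline of segmentation, clustering of segments into patterns, and a Markov model over those patterns. As a rough illustration only, and not the authors' pHMM algorithm, the Python sketch below runs a single pass of that idea under simplifying assumptions: segments have a fixed width, each segment is summarized by its mean slope, patterns are found with a plain 1-D k-means, and the temporal relations between patterns are summarized as an observed transition matrix. The function names `segment_slopes`, `kmeans_1d`, and `transition_matrix` are hypothetical, and the iterative refinement and pruning strategies mentioned in the abstract are not implemented.

```python
# Illustrative sketch only: fixed-width segmentation and 1-D k-means are
# assumptions made here, not the segmentation or clustering used in the paper.

def segment_slopes(series, width):
    """Cut the series into non-overlapping fixed-width segments
    and return the mean slope of each segment."""
    slopes = []
    for start in range(0, len(series) - width + 1, width):
        seg = series[start:start + width]
        slopes.append((seg[-1] - seg[0]) / (width - 1))
    return slopes

def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means with deterministic, evenly spread initial centers.
    Returns (centers, labels)."""
    lo, hi = min(values), max(values)
    if k == 1:
        centers = [sum(values) / len(values)]
    else:
        centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign():
        # Label each value with the index of its nearest center.
        return [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]

    for _ in range(iterations):
        labels = assign()
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return centers, assign()

def transition_matrix(labels, k):
    """Estimate pattern-to-pattern transition probabilities from the
    sequence of pattern labels (rows sum to 1, or are all 0 if unseen)."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

if __name__ == "__main__":
    # Toy series: alternating upward and downward ramps.
    series = [0, 1, 2, 3, 3, 2, 1, 0] * 3
    slopes = segment_slopes(series, width=4)
    centers, labels = kmeans_1d(slopes, k=2)
    print(labels)
    print(transition_matrix(labels, k=2))
```

On the toy series, the two recovered patterns correspond to "ramp down" and "ramp up", and the transition matrix shows that each pattern is always followed by the other. An iterative scheme in the spirit of the abstract would feed such a model back into the segmentation and clustering steps; that feedback loop is left out here.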

    Published In

    SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
    June 2011
    1364 pages
    ISBN:9781450306614
    DOI:10.1145/1989323
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. hidden Markov model

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '11
    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 54
    • Downloads (last 6 weeks): 5
    Reflects downloads up to 17 Jan 2025.

    Cited By

    • (2024) KARATECH: A Practice Support System Using an Accelerometer to Reduce the Preliminary Actions of Karate. Sensors 24:7 (2306). DOI: 10.3390/s24072306. Online publication date: 5-Apr-2024
    • (2024) Semantic Relationship-Based Unsupervised Representation Learning of Multivariate Time Series. IEICE Transactions on Information and Systems E107.D:2 (191-200). DOI: 10.1587/transinf.2023EDP7046. Online publication date: 1-Feb-2024
    • (2024) Sequential Data Classification under Dynamic Emission. Pattern Recognition and Image Analysis 34:1 (187-198). DOI: 10.1134/S1054661824010048. Online publication date: 1-Mar-2024
    • (2024) Shrink: Data Compression by Semantic Extraction and Residuals Encoding. 2024 IEEE International Conference on Big Data (BigData) (650-659). DOI: 10.1109/BigData62323.2024.10825287. Online publication date: 15-Dec-2024
    • (2023) Mimir: Finding Cost-efficient Storage Configurations in the Public Cloud. Proceedings of the 16th ACM International Conference on Systems and Storage (22-34). DOI: 10.1145/3579370.3594776. Online publication date: 5-Jun-2023
    • (2023) RedPacketBike: A Graph-Based Demand Modeling and Crowd-Driven Station Rebalancing Framework for Bike Sharing Systems. IEEE Transactions on Mobile Computing 22:7 (4236-4252). DOI: 10.1109/TMC.2022.3145979. Online publication date: 1-Jul-2023
    • (2023) Towards Efficient and Privacy-Preserving Interval Skyline Queries Over Time Series Data. IEEE Transactions on Dependable and Secure Computing 20:2 (1348-1363). DOI: 10.1109/TDSC.2022.3153759. Online publication date: 1-Mar-2023
    • (2023) Forecasting movements of stock time series based on hidden state guided deep learning approach. Information Processing & Management 60:3 (103328). DOI: 10.1016/j.ipm.2023.103328. Online publication date: May-2023
    • (2022) Semantics and Anomaly Preserving Sampling Strategy for Large-Scale Time Series Data. ACM/IMS Transactions on Data Science 2:4 (1-25). DOI: 10.1145/3511918. Online publication date: 30-Mar-2022
    • (2021) Automated deep learning for trend prediction in time series data. 2021 IEEE 24th International Conference on Information Fusion (FUSION) (1-8). DOI: 10.23919/FUSION49465.2021.9626910. Online publication date: 1-Nov-2021
