[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/SSDM.2000.869778guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

TSA-Tree: A Wavelet-Based Approach to Improve the Efficiency of Multi-Level Surprise and Trend Queries on Time-Series Data

Published: 26 July 2000 Publication History

Abstract

We introduce a novel wavelet-based tree structure, termed TSA-tree, which improves the efficiency of multi-level trend and surprise queries on time sequence data. With the explosion of scientific observation data (some conceptualized as time-sequences), we are facing the challenge of efficiently storing, retrieving and analyzing this data. Frequent queries on this data set are to find trends (e.g., global warming) or surprises (e.g., undersea volcano eruption) within the original time-series. The challenge, however, is that these trend and surprise queries are needed at different levels of abstractions (e.g., within the last week, last month, last year or last decade). To support these multi-level trend and surprise queries, sometimes-huge subset of raw data needs to be retrieved and processed. To expedite this process, we utilize our TSA-tree. Each node of TSA-tree contains pre-computed trends and surprises at different levels. Wavelet transform is used recursively to construct TSA nodes. As a result, each node of TSA tree is readily available for visualization of trends and surprises. In addition, the size of each node is significantly smaller than that of the original time series, resulting in faster I/O operations. However, a limitation of TSA-tree is that its size is larger than the original time series. To address this shortcoming, first we prove that the storage space required to store the optimal subtree of TSA-tree (OTSA-tree) is no more than that required to store the original time-series without losing any information. Next, we propose two alternative techniques to reduce the size of OTSA-tree even further, while maintaining an acceptable query precision as compared to querying the original time sequences. Utilizing real and synthetic time-sequence databases, we compare our techniques with some well-known algorithms such as DFT and SVD in both performance and query precision. The results indicate the superiority of our approach. Finally, we show that our techniques are scalable as we increase either the database size or the length of time sequences.

References

[1]
P. R. A. Arning, R. Agrawal. A linear method for deviation detection in large databases. KDD, pages 164-169, 1996.
[2]
R. Agrawal, C. Faloutsos, and A. Swami. Efficient similarity search in sequence database. Fourth International Conference on Foundations of Data Organization and Algorithm, 1993.
[3]
R. Agrawal, K.I. Lin, H. S. Sawhney, and K. Sim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-series Database. VLDB, 1995.
[4]
V. Barnett and T. Lewis. Outliers in Statistical Data. Wiley & Sons, Chichester, New York, 3rd edition, 1994.
[5]
C. S. Burrus, R. A. Gopinath, and H. Guo. Introduction to wavelets and wavelet transforms: a primer. Prentice Hall, Upper Saddle River, N.J., 1998.
[6]
C. Faloutsos, M. Ranganthan, and Y. Manolopoulos. Fast Subsequence Matching in Time-series Datebase. SIGMOD, Proc. of Annual Conference, Minneapolis, 1994.
[7]
K. Chan and A. W. Fu. Efficient time series matching by wavelets. ICDE, pages 126-133, 1999.
[8]
C. Chatfield. The Analysis of Time Series: an Introduction. Chapman and Hall, London, 5th edition, 1996.
[9]
C. K. Chui. Wavelets: a tutorial in theory and applications. Academic Press, Boston, 1992.
[10]
C. K. Chui. An overview of wavelets. In Approximation Theory and Functional Analysis. Academic Press, Boston, 1993.
[11]
T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to algorithms. The MIT Press, Cambridge, Mass, 1989.
[12]
I. Daubechies. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41(7):909-996, October 1988.
[13]
D. Rafiei and A. Mendelzon. Similarity-Based Query for Time Series Data. SIGMOD, July 1996.
[14]
W. A. Fuller. Introduction to statistical time series. J. Wiley series in probability and statistics, New York, 2nd edition, 1996.
[15]
A. Haar. Zur theorie der orthogonalen funktionensysteme. Mathematics Annal., 69:331-371, 1910.
[16]
R. Hyndman. http://wwwpersonal.buseco.monash.edu.au/hyndman/tsdl/.
[17]
M. Kendall. Time Series. Griffin, London, 2nd edition, 1976.
[18]
E. M. Knorr and R. T. Ng. Algorithms for mining distance-based outliers in large datasets. Proceedings of the 24th VLDB Conference, pages 392-403, 1998.
[19]
E. M. Knorr and R. T. Ng. Finding intensional knowledge of distance-based outliers. Proceedings of the 25th VLDB Conference, pages 211-222, 1999.
[20]
F. Korn, H. V. Jagadish, and C. Faloutsos. Efficiently supporting ad hoc queries in large datasets of time sequences. Proceedings of the ACM SIGMOD international conference on Management of data, 26(2):289-300, 1997.
[21]
C. Li, P. S. Yu, and V. Castelli. Malm: A framework for mining sequence database at multiple abstraction levels. In Proceedings of the 1998 ACM 7th international conference on Information and knowledge management, pages 267-272, 1998.
[22]
B. A. Logue. http://www.mdshistoricalquotes.com/.
[23]
S. Mallat. A theory for multiresolution signal decomposition The wavelet representation. IEEE transactions on pattern Analysis and Machine Intelligent, 11 (7):674-693, July 1989.
[24]
Y. Qu, C. Wang, and X. S. Wang. Supporting fast search in time series for movement patterns in multiple scales. In Proceedings of the 1998 ACM 7th international conference on Information and knowledge management, pages 251-258, 1998.
[25]
D. Rafiei and A. Mendelzon. Similarity-based queries for time series data. SIGMOD, pages 13-24, 1997.
[26]
R. Agrawal, G. Psaila, E. L. Wimmers, and M. Zait. Querying Shapes of Histories. VLDB, September 1995.
[27]
R.D. Edwards and J. Magee. Technical analysis of stock trends. John Magee, Springfield, Massachsetts, 1969.
[28]
H. Shatkay and S. Zdonik. Approximate queries and representations for large data sequence. ICDE, 1996.
[29]
W. Sweldens. The lifting scheme: A new philosophy in biorthogonal wavelet construction. In A. F. Laine and M. Unser, editors, Wavelet Applications in Signal and Image Processing III, pages 68-79, 1995.
[30]
W. Sweldens. The lifting scheme: A construction of second generation wavelets. SIAM J. Math. Anal., 1997.

Cited By

View all
  • (2018)Enhancing Quality of Coverage for Target Coverage Problem Using Discrete Haar WaveletWireless Personal Communications: An International Journal10.1007/s11277-018-5792-4101:4(1817-1837)Online publication date: 1-Aug-2018
  • (2018)Anomaly detection using piecewise aggregate approximation in the amplitude domainApplied Intelligence10.1007/s10489-017-1017-x48:5(1097-1110)Online publication date: 1-May-2018
  • (2015)A time-series compression technique and its application to the smart gridThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-014-0368-824:2(193-218)Online publication date: 1-Apr-2015
  • Show More Cited By
  1. TSA-Tree: A Wavelet-Based Approach to Improve the Efficiency of Multi-Level Surprise and Trend Queries on Time-Series Data

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Guide Proceedings
    SSDBM '00: Proceedings of the 12th International Conference on Scientific and Statistical Database Management
    July 2000
    ISBN:0769506860

    Publisher

    IEEE Computer Society

    United States

    Publication History

    Published: 26 July 2000

    Author Tags

    1. data mining
    2. outlier
    3. surprise mining
    4. time series
    5. trend mining
    6. wavelet

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 19 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2018)Enhancing Quality of Coverage for Target Coverage Problem Using Discrete Haar WaveletWireless Personal Communications: An International Journal10.1007/s11277-018-5792-4101:4(1817-1837)Online publication date: 1-Aug-2018
    • (2018)Anomaly detection using piecewise aggregate approximation in the amplitude domainApplied Intelligence10.1007/s10489-017-1017-x48:5(1097-1110)Online publication date: 1-May-2018
    • (2015)A time-series compression technique and its application to the smart gridThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-014-0368-824:2(193-218)Online publication date: 1-Apr-2015
    • (2014)An approach to dimensionality reduction in time seriesInformation Sciences: an International Journal10.1016/j.ins.2013.10.037260(15-36)Online publication date: 1-Mar-2014
    • (2014)Adaptive fuzzy clustering based anomaly data detection in energy system of steel industryInformation Sciences: an International Journal10.1016/j.ins.2013.05.018259(335-345)Online publication date: 1-Feb-2014
    • (2013)Improving Space Localization Properties of the Discrete Wavelet TransformInformatica10.5555/2699244.269925324:4(657-675)Online publication date: 1-Oct-2013
    • (2012)Periodic pattern analysis of non-uniformly sampled stock market dataIntelligent Data Analysis10.5555/2595532.259554216:6(993-1011)Online publication date: 1-Nov-2012
    • (2011)Discrete wavelet transform-based time series analysis and miningACM Computing Surveys10.1145/1883612.188361343:2(1-37)Online publication date: 4-Feb-2011
    • (2011)A review on time series data miningEngineering Applications of Artificial Intelligence10.1016/j.engappai.2010.09.00724:1(164-181)Online publication date: 1-Feb-2011
    • (2010)A real time hybrid pattern matching scheme for stock time seriesProceedings of the Twenty-First Australasian Conference on Database Technologies - Volume 10410.5555/1862242.1862263(161-170)Online publication date: 1-Jan-2010
    • Show More Cited By

    View Options

    View options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media