Abstract
This work motivates the need for more flexible structural similarity measures between time-series sequences, which are based on the extraction of important periodic features. Specifically, we present non-parametric methods for accurate periodicity detection and we introduce new periodic distance measures for time-series sequences. We combine these new measures with an effective metric tree index structure for efficiently answering k-Nearest-Neighbor queries. The goal of these tools and techniques are to assist in detecting, monitoring and visualizing structural periodic changes. It is our belief that these methods can be directly applicable in the manufacturing industry for preventive maintenance and in the medical sciences for accurate classification and anomaly detection.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Due to the assumption of the Fourier Transform that the data is periodic, proper windowing of the data might be necessary for achieving a more accurate harmonic analysis. In order to enhance the flow of the paper, we will not go into describing windowing techniques, but direct the interested reader to Harris (1978) for an excellent review.
References
Agrawal, R., Faloutsos, C., and Swami, A. 1993. Efficient similarity search in sequence databases. In Proc. of the 4th FODO, pp. 69–84.
Beckmann, N., Kriegel, H.-P., Schneider, R., and Seeger, B. 1990. The R*-tree: An efficient and robust access method for points and rectangles. In Proc. of ACM SIGMOD.
Bowerman, B. and O’Connell, R. 2000 Forecasting and Time Series: An Applied Approach. Duxbury Press.
Chiueh, T. 1994. Data structures and algorithms for nearest neighbor search in general metric spaces. In Proceedings of the 20th VLDB.
Ciaccia, P., Patella, M., and Zezula, P. 1997. M-tree: An efficient access method for similarity search in metric spaces. In Proc. of 23rd VLDB, pp. 426–435.
Ebersbach, G., Sojer, M., Valldeoriola, F., Wissel, J., Müller, J., Tolosa, E., and Poewe, W. 1999. Comparative analysis of gait in Parkinson's disease, cerebellar ataxia and subcortical arteriosclerotic encephalopathy. Brain, 122(7):1349–1355.
Elfeky, M.G., Aref, W.G., and Elmagarmid, A.K. 2004. Using convolution to mine obscure periodic patterns in one pass. In Proc. of EDBT.
Ergün, F., Muthukrishnan, S., and Sahinalp, S.C. 2004. Sublinear methods for detecting periodic trendsin data streams. In LATIN.
Friss-Cristensen, E. and Lassen, K. 1991. Length of solar cycle---An indicator of solar-activity closely related with climate. In Science, 254:698–700.
Fu, A., Chan, P., Cheung, Y.-L., and Moon, Y.S. 2000. Dynamic VP-Tree Indexing for N-Nearest Neighbor Search Given Pair-Wise Distances. The VLDB Journal.
Guttman, A. 1984. R-trees: A dynamic index structure for spatial searching. In Proceedings of the ACM SIGMOD, pp. 47–57.
Harris, F.J. 1978. On the use of windows for harmonic analysis with the discrete fourier transform. In Proc. of the IEEE, 66(1).
Idé, T. and Inoue, K. 2004. Knowledge discovery from time-series data using nonlinear transformations. In Proc. of the 4th Data Mining Workshop (Japan Soc. for Software Science and Technology.
Kahn, M., Mackisack, M., Osborne, M., and Smyth, G. 1992. On the consistency of prony's method and related algorithms. Journal of Statistical Computation and Simulation, 49:111–124.
Kalpakis, K., Gada, D., and Puttagunta, V. 2001. Distance measures for elective clustering of ARIMA time-series. In Proc. of IEEE International Conference on Data Mining (ICDM), pp. 273–280.
Keogh, E. 2002. Exact indexing of dynamic time warping. In Proc. of VLDB.
Keogh, E. and Kasetty, S. 2002. On the need for time series data mining benchmarks: A survey and empirical demonstration. In Proc. of SIGKDD.
Keogh, E., Chu, S., Hart, D., and Pazzani, M. 2001. An online algorithm for segmenting time series. In Proc. of ICDM.
Keogh, E., Lonardi, S., and Ratanamahatana, A. 2004. Towards parameter-free data mining. In Proc. of SIGKDD.
Kruskal, J. and Wish, M. 1978. Multidimensional Scaling. Sage Publications.
Marple, L. 1987. Digital Spectral Analysis. Prentice Hall.
Mitchell, J.S. 1993. An Introduction to Machinery Analysis and Monitoring. PennWell Publ. Co.
Oppenheim, A. Willsky, A. and Nawab, S. 1997. Signals and Systems, 2nd edition. Prentice Hall.
Papoulis, A. 1977. Signal Analysis. McGraw-Hill.
Tenenbaum, J.B., de Silva, V., and Langford, J.C. 2000. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323.
Tulena, J., Azzolini, M., de Vriesa, J., Groeneveld, W.H., Passchier, J., and van de Wetering, B. 1999. Quantitative study of spontaneous eye blinks and eye tics in Gilles de la Tourette's syndrome. In Journal of Neurol. Neurosurg. Psychiatry 67:800–802.
Vlachos, M., Meek, C., Vagena, Z., and Gunopulos, D. 2004. Identification of similarities, periodicities & bursts for online search queries. In Proc. of SIGMOD.
Yianilos, P. 1992. Data Structures and algorithms for nearest neighbor search in general metric spaces. In Proc. of 3rd SIAM on Discrete Algorithms.
Zhu, Y. and Shasha, D. 2002. Statstream: Statistical monitoring of thousands of data streams in real time. In VLDB.
Acknowledgments
We are thankful to MSN and Microsoft for letting us use a portion of the MSN query logs. We would like also to thank Zografoula Vagena for her help in the experimental section. Finally, we acknowledge the useful comments of the anonymous reviewers, that have helped us present this work in a more constructive and complete way.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Vlachos, M., Yu, P.S., Castelli, V. et al. Structural Periodic Measures for Time-Series Data. Data Min Knowl Disc 12, 1–28 (2006). https://doi.org/10.1007/s10618-005-0016-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-005-0016-4