[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

Temporal data mining using shape space representations of time series

Published: 01 December 2010 Publication History

Abstract

Subspace representations that preserve essential information of high-dimensional data may be advantageous for many reasons such as improved interpretability, overfitting avoidance, acceleration of machine learning techniques. In this article, we describe a new subspace representation of time series which we call polynomial shape space representation. This representation consists of optimal (in a least-squares sense) estimators of trend aspects of a time series such as average, slope, curve, change of curve, etc. The shape space representation of time series allows for a definition of a novel similarity measure for time series which we call shape space distance measure. Depending on the application, time series segmentation techniques can be applied to obtain a piecewise shape space representation of the time series in subsequent segments. In this article, we investigate the properties of the polynomial shape space representation and the shape space distance measure by means of some benchmark time series and discuss possible application scenarios in the field of temporal data mining.

References

[1]
R. Agrawal, K. Lin, H. Sawhney, K. Shim, Fast similarity search in the presence of noise, scaling, and translation in time-series databases, in: Proceedings of the 21st International Conference on Very Large Data Bases (VLDB '95), Atlantic City, NJ, USA, 1995, pp. 490-501.
[2]
C. Antunes, A. Oliveira, Temporal data mining: an overview, in: Proceedings of Workshop on Temporal Data Mining (KDD'01), San Francisco, CA, USA, 2001, pp. 1-13.
[3]
Y. Araki, D. Arita, R. Ichiro Taniguchi, Motion motif extraction from high-dimensional motion information, in: Proceedings of the 12th Korea-Japan Joint Workshop on Frontiers of Computer Vision (FCV), Tokushima, Japan, 2006, pp. 92-97.
[4]
N. Beckmann, H. Kriegel, R. Schneider, B. Seeger, The R*-tree: an efficient and robust access method for points and rectangles, in: Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data (SIGMOD '90), Atlantic City, NJ, USA, 1990, pp. 322-331.
[5]
Bellman, R., On the approximation of curves by line segments using dynamic programming. Communications of the ACM. v4 i6. 284
[6]
Bishop, C.M., Pattern Recognition and Machine Learning. 2006. Springer, New York, NY, USA.
[7]
Brigham, E., Fast Fourier Transform and its Applications. 1988. Prentice-Hall, Englewood Cliffs, NJ, USA.
[8]
Burges, C.J.C., A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery. v2 i2. 121-167.
[9]
Cevikalp, H., Yavuzand, H.S., Cay, M.A. and Barkana, A., Two-dimensional subspace classifiers for face recognition. Neurocomputing. v72. 1111-1120.
[10]
C.-C. Chang, C.-J. Lin, LIBSVM-a library for support vector machines, visited on July 8, 2009 (2007). URL: {http://www.csie.ntu.edu.tw/~cjlin/libsvm/}.
[11]
B. Chiu, E. Keogh, S. Lonardi, Probabilistic discovery of time series motifs, in: Proceedings of the 9th International Conference on Knowledge Discovery and Data Mining (KDD), Washington, DC, USA, 2003, pp. 493-498.
[12]
Cristianini, N. and Shawe-Taylor, J., An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. 2000. Cambridge University Press, New York, NY, USA.
[13]
G. Das, K.-I. Lin, H. Mannila, G. Renganathan, P. Smyth, Rule discovery from time series, in: Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD), New York, NY, USA, 1998, pp. 16-22.
[14]
I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 1992.
[15]
De La Torre, F. and Black, M.J., A framework for robust subspace learning. International Journal of Computer Vision. v54 i1-3. 117-142.
[16]
H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, E. Keogh, Querying and mining of time series data: experimental comparison of representations and distance measures, in: Proceedings of the 34th International Conference on Very Large Data Bases (VLDB08), Auckland, New Zealand, 2008, pp. 1542-1552.
[17]
Duda, R.O., Hart, P.E. and Stork, D.G., Pattern Classification. 2001. second ed. Wiley-Interscience, New York, NY.
[18]
Elhay, S., Golub, G.H. and Kautsky, J., Updating and downdating of orthogonal polynomials with data fitting applications. SIAM Journal on Matrix Analysis and Applications. v12 i2. 327-353.
[19]
Fassetti, F., Greco, G. and Terracina, G., Mining loosely structured motifs from biological data. IEEE Transactions on Knowledge and Data Engineering. v20 i11. 1472-1489.
[20]
T.-C. Fu, F.-L. Chung, R. Luk, C.-M. Ng, Preventing meaningless stock time series pattern discovery by changing perceptually important point detection, in: Lecture Notes in Computer Science, vol. 3613, Springer, Berlin, Heidelberg, Germany, 2005, pp. 1171-1174 (Chapter 146).
[21]
Fu, Y. and Huang, T.S., Human age estimation with regression on discriminative aging manifold. IEEE Transactions on Multimedia. v10 i4. 578-584.
[22]
Fuchs, E., On discrete polynomial least-squares approximation in moving time windows. In: Gautschi, W., Golub, G., Opfer, G. (Eds.), Applications and Computation of Orthogonal Polynomials, International Series of Numerical Mathematics, No. 131, Birkhäuser, Basel, Switzerland. pp. 93-107.
[23]
E. Fuchs, Schnelle quadratmittelapproximation in gleitenden Zeitfenstern mit diskreten orthogonalen Polynomen, Berichte aus der Informatik, Shaker Verlag, Aachen, Germany, 2000 (Ph.D. dissertation, University of Passau, 1999).
[24]
Fuchs, E., Gruber, C., Reitmaier, T. and Sick, B., Processing short-term and long-term information with a combination of polynomial approximation techniques and time-delay neural networks. IEEE Transactions on Neural Networks. v20 i9. 1450-1462.
[25]
E. Fuchs, T. Gruber, J. Nitschke, B. Sick, On-line segmentation of time series based on polynomial least-squares approximations, IEEE Transactions on Pattern Analysis and Machine Intelligence. URL: {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2010.44}.
[26]
Fuchs, E., Gruber, T., Nitschke, J. and Sick, B., On-line motif detection in time series with SwiftMotif. Pattern Recognition. v42 i11. 3015-3031.
[27]
Gao, X. and Tian, C., Multi-view face recognition based on tensor subspace analysis and view manifold modeling. Neurocomputing. v72. 3742-3750.
[28]
P. Geurts, CBF problem database, visited on July 8, 2009 (2002). URL: {http://www.montefiore.ulg.ac.be/~geurts/thesis.html}.
[29]
C. Gruber, T. Gruber, B. Sick, Online signature verification with new time series kernels for support vector machines, in: Proceedings of the IAPR International Conference on Biometrics, Hong Kong, China, 2006, pp. 500-508.
[30]
A. Guttman, R-trees: a dynamic index structure for spatial searching, Readings in Database Systems, 1988, pp. 599-609.
[31]
X.-H. Han, Y.-W. Chen, T. Sukegawa, A supervised nonlinear neighborhood embedding of color histogram for image indexing, in: Proceedings of the 15th IEEE International Conference on Image Processing (ICIP 2008), San Diego, CA, USA, 2008, pp. 949-952.
[32]
S. Hettich, S.D. Bay, UCI KDD archive, visited on September 2, 2009 (1999). URL: {http://kdd.ics.uci.edu}.
[33]
K. Honda, I. Hidetomo, A. Notsu, A sequential learning algorithm for collaborative filtering with linear fuzzy clustering, in: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics (SMC 2006), vol. 2, Taipei, Taiwan, 2006, pp. 1056-1061.
[34]
E. Keogh, Time series tutorials, visited on July 20, 2009. URL: {http://www.cs.ucr.edu/~eamonn/tutorials.html}.
[35]
Keogh, E., Chakrabarti, K., Pazzani, M. and Mehrotra, S., Locally adaptive dimensionality reduction for indexing large time series databases. SIGMOD Records. v30 i2. 151-162.
[36]
E. Keogh, S. Chu, D. Hart, M. Pazzani, An online algorithm for segmenting time series, in: Proceedings of the 2001 IEEE International Conference on Data Mining (ICDM), San Jose, CA, USA, 2001, pp. 289-296.
[37]
Keogh, E. and Kasetty, S., On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Mining and Knowledge Discovery. v7 i4. 349-371.
[38]
E. Keogh, J. Lin, W. Truppel, Clustering of time series subsequences is meaningless: Implications for previous and future research, in: Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM), Washington, DC, USA, 2003, pp. 115-122.
[39]
E. Keogh, M. Pazzani, An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback, in: Proceedings of the 4th International Conference of Knowledge Discovery and Data Mining (KDD), New York, NY, USA, 1998, pp. 239-241.
[40]
E. Keogh, X. Xi, L. Wei, C.A. Ratanamahatana, The UCR time series classification/clustering homepage, visited on July 20, 2009 (2006). URL: {www.cs.ucr.edu/~eamonn/time_series_data/}.
[41]
E.J. Keogh, M.J. Pazzani, A simple dimensionality reduction technique for fast similarity search in large time series databases, in: Proceedings of the 4th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Current Issues and New Applications (PAKD), Kyoto, Japan, 2000, pp. 122-133.
[42]
Kononenko, I. and Kukar, M., Machine Learning and Data Mining. 2007. Horwood Publishing, Westergate, Great Britain.
[43]
Laxman, S. and Sastry, P.S., A survey of temporal data mining. Sadhana. v31 i2. 173-198.
[44]
Lee, C.-H., Liu, A. and Chen, W.-S., Pattern discovery of fuzzy time series for financial prediction. IEEE Transactions on Knowledge and Data Engineering. v18 i5. 613-625.
[45]
D. Lemire, A better alternative to piecewise linear time series segmentation, in: Proceedings of the 20th National Conference on Artificial Intelligence (AAAI), Minneapolis, MN, USA, 2007, pp. 545-550.
[46]
Li, Y., On incremental and robust subspace learning. Pattern Recognition. v37. 1509-1518.
[47]
Li, Y., Luo, D. and Liu, S., Orthogonal discriminant linear local tangent space alignment for face recognition. Neurocomputing. v72. 1319-1323.
[48]
J. Lin, E. Keogh, S. Lonardi, P. Patel, Finding motifs in time series, in: Proceedings of the 2nd Workshop on Temporal Data Mining, Edmonton, AB, CA, 2002, pp. 53-68.
[49]
H. Liu, H. Motoda (Eds.), Feature Extraction, Construction, and Selection: A Data Mining Perspective, Kluwer Academic Publishers, Boston, MA, USA, 1998.
[50]
Liu, H. and Motoda, H., Feature Selection for Knowledge Discovery and Data Mining. 1998. Kluwer Academic Publishers, Boston, MA, USA.
[51]
Lorenz, E.N., Deterministic nonperiodic flow. Journal of the Atmospheric Sciences. v20 i2. 130-141.
[52]
Maji, P. and Pal, S., Rough-fuzzy c-medoids algorithm and selection of bio-basis for amino acid sequence analysis. IEEE Transactions on Knowledge and Data Engineering. v19 i6. 859-872.
[53]
S. Manganaris, Supervised classification with temporal data, Ph.D. Thesis, School of Engineering, Vanderbilt University, 1997.
[54]
I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, T. Euler, Yale: rapid prototyping for complex data mining tasks, in: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'06), Philadelphia, PA, USA, 2006, pp. 935-940.
[55]
Mikut, R., Burmeister, L., Gröll, O. and Reischl, M., Takagi-Sugeno-Kang fuzzy classifiers for a special class of time-varying systems. IEEE Transactions on Fuzzy Systems. v16 i4. 1038-1049.
[56]
D. Minnen, C.L. Isbell, I. Essa, T. Starner, Discovering multivariate motifs using subsequence density estimation and greedy mixture learning, in: Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI), Vancouver, Canada, 2007, pp. 615-620.
[57]
D. Minnen, T. Starner, I. Essa, C. Isbell, Discovering characteristic actions from on-body sensor data, in: Proceedings of the 10th IEEE International Symposium on Wearable Computers (ISWC), Montreux, Switzerland, 2006, pp. 11-18.
[58]
A. Mueen, E. Keogh, Q. Zhu, S. Cash, B. Westover, Exact discovery of time series motifs, in: Proceedings of the SIAM International Conference on Data Mining (SDM), Sparks, NV, USA, 2009, pp. 473-484.
[59]
Nikiforov, A., Suslov, S. and Uvarov, V., Classical Orthogonal Polynomials of a Discrete Variable, Springer Series in Computational Physics. 1991. Springer-Verlag, Berlin, Heidelberg, New York.
[60]
R. Olszewski, ECG and Wafer databases, visited on July 20, 2009 (2001). URL: {http://www.cs.cmu.edu/~bobski/data/data.html}.
[61]
R. Olszewski, Generalized feature extraction for structural pattern recognition in time-series data, Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2001.
[62]
T. Palpanas, M. Vlachos, E. Keogh, D. Gunopulos, W. Truppel, Online amnesic approximation of streaming time series, in: Proceedings of the 20th International Conference on Data Engineering (ICDE), Boston, MA, USA, 2004, pp. 339-349.
[63]
Pham, D. and Chan, A., Control chart pattern recognition using a new type of self-organizing neural network. Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering. v2 i212. 115-127.
[64]
Pratt, K.B. and Fink, E., Search for patterns in compressed time series. International Journal of Image and Graphics. v2 i1. 89-106.
[65]
Roddick, J.F. and Spiliopoulou, M., A survey of temporal knowledge discovery paradigms and methods. IEEE Transactions on Knowledge and Data Engineering. v14 i4. 750-767.
[66]
N. Saito, Local feature extraction and its application using a library of bases, Ph.D. Thesis, Yale University, New Haven, CT, USA, 1994.
[67]
T. Sellis, N. Roussopoulos, C. Faloutsos, The R+-tree: a dynamic index for multi-dimensional objects, in: Proceedings of the 13th International Conference on Very Large Data Bases (VLDB '87), Brighton, Great Britain, 1987, pp. 507-518.
[68]
Y. Tanaka, K. Iwamoto, K. Uehara, Discovery of time series motif from multi-dimensional data based on MDL principle, in: Machine Learning, vol. 58, Springer, 2005, pp. 269-300.
[69]
Y. Tanaka, K. Uehara, Discover motifs in multidimensional time-series using the principal component analysis and the MDL principle, in: Proceedings of the International Conference on Machine Learning and Data Mining (MLDM), Tokushima, Japan, 2003, pp. 252-265.
[70]
Tao, D., Li, X., Wu, X. and Maybank, S., Tensor rank one discriminant analysis-a convergent method for discriminative multilinear subspace selection. Neurocomputing. v71. 1866-1882.
[71]
Tucker, W., A rigorous ODE solver and Smale's 14th problem. Foundations of Computational Mathematics. v2 i1. 53-117.
[72]
Wang, F. and Wang, X., Neighborhood discriminant tensor mapping. Neurocomputing. v72. 2035-2039.
[73]
Wilson, W.O., Birkin, P. and Aickelin, U., The motif tracking algorithm. International Journal of Automation and Computing. v5 i1. 32-44.
[74]
J. Yan, N. Liu, B. Zhang, Q. Yang, S. Yan, Z. Chen, A novel scalable algorithm for supervised subspace learning, in: Proceedings of the Sixth International Conference on Data Mining (ICDM'06), Hong Kong, China, 2006, pp. 721-730.
[75]
Yan, S., Liu, J., Tang, X. and Huang, T.S., A parameter-free framework for general supervised subspace learning. IEEE Transactions on Information Forensics and Security. v2 i1. 69-76.
[76]
Yang, J., Yan, S. and Huang, T.S., Ubiquitously supervised subspace learning. IEEE Transactions on Image Processing. v18 i2. 241-249.
[77]
Yuan, Y., Pang, Y., Pan, J. and Li, X., Scene segmentation based on IPCA for visual surveillance. Neurocomputing. v72 i10-12. 2450-2454.
[78]
X. Zhang, X. Gao, Y. Wang, Microcalcification clusters detection with tensor subspace learning and twin SVM, in: Proceedings of the 7th World Congress on Intelligent Control and Automation, Chongqing, China, 2008, pp. 1758-1763.
[79]
Zhang, X., Wu, J., Yang, X., Ou, H. and Lv, T., A novel pattern extraction method for time series classification. Optimization and Engineering. v10 i2. 253-271.
[80]
Q. Zhao, S. Bhowmick, Sequential pattern mining: a survey, Technical Report, Nanyang Technical University, Singapore, 2003.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Neurocomputing
Neurocomputing  Volume 74, Issue 1-3
December, 2010
499 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 01 December 2010

Author Tags

  1. Least-squares approximation
  2. Orthogonal polynomials
  3. Polynomial shape space
  4. Similarity measure
  5. Subspace learning
  6. Temporal data mining
  7. Time series

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 20 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2021)A Survey of Similarity Measures for Time stamped Temporal DatasetsInternational Conference on Data Science, E-learning and Information Systems 202110.1145/3460620.3460754(193-197)Online publication date: 5-Apr-2021
  • (2018)Performing event detection in time series with SwiftEventPattern Analysis & Applications10.1007/s10044-017-0657-021:2(543-562)Online publication date: 1-May-2018
  • (2016)Change points detection in crime-related time seriesApplied Soft Computing10.1016/j.asoc.2015.12.00440:C(441-454)Online publication date: 1-Mar-2016
  • (2014)Asynchronism-based principal component analysis for time series data miningExpert Systems with Applications: An International Journal10.1016/j.eswa.2013.10.01941:6(2842-2850)Online publication date: 1-May-2014
  • (2013)A methodological approach to mining and simulating data in complex information systemsIntelligent Data Analysis10.5555/2595588.259559117:5(753-769)Online publication date: 1-Sep-2013
  • (2013)Algebraic segmentation of short nonstationary time series based on evolutionary prediction algorithmsNeurocomputing10.1016/j.neucom.2013.05.013121(354-364)Online publication date: 1-Dec-2013
  • (2012)Time-series data miningACM Computing Surveys10.1145/2379776.237978845:1(1-34)Online publication date: 7-Dec-2012

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media