Abstract
Piecewise Aggregate Approximation (PAA) is a competitive basic dimension reduction method for high-dimensional time series mining. In practice, however, it has an obvious limitation: some important information is lost, especially the trend. In this paper, we propose two new approaches for time series that utilize approximate trend feature information. Our first method records the trend relative to the mean value of each segment: it divides each segment into two parts and uses the numerical average of each part to represent the trend. We prove that this method satisfies the lower-bounding property, which guarantees no false dismissals. Our second method uses a binary string to record the trend, which is also defined relative to the mean of each segment. We apply our methods to similarity measurement in classification and anomaly detection; the experimental results show that suitably extracting the trend feature improves accuracy and effectiveness.
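The ideas summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`paa`, `half_segment_means`, `trend_bits`) are hypothetical, the series length is assumed to divide evenly into segments, and the per-point bit encoding in `trend_bits` (1 if a point is at or above its segment mean, else 0) is only one possible reading of "a binary string relative to the mean", since the abstract does not specify the encoding.

```python
import numpy as np


def paa(series, n_segments):
    """Classic PAA: the mean of each equal-length segment."""
    x = np.asarray(series, dtype=float)
    if x.size % n_segments != 0:
        raise ValueError("sketch assumes length divisible by n_segments")
    return x.reshape(n_segments, -1).mean(axis=1)


def half_segment_means(series, n_segments):
    """Mean of the first and second half of each segment.

    Returns an array of shape (n_segments, 2). Comparing the two
    halves gives a rough indication of the trend inside the segment.
    """
    x = np.asarray(series, dtype=float)
    if x.size % (2 * n_segments) != 0:
        raise ValueError("sketch assumes length divisible by 2 * n_segments")
    return x.reshape(n_segments, 2, -1).mean(axis=2)


def trend_bits(series, n_segments):
    """One bit string per segment (assumed encoding, see lead-in):
    '1' where a point is >= its segment mean, '0' otherwise."""
    x = np.asarray(series, dtype=float)
    if x.size % n_segments != 0:
        raise ValueError("sketch assumes length divisible by n_segments")
    segments = x.reshape(n_segments, -1)
    above = segments >= segments.mean(axis=1, keepdims=True)
    return ["".join("1" if flag else "0" for flag in row) for row in above]


# A rising segment followed by a falling one.
series = [1, 2, 3, 4, 4, 3, 2, 1]
print(paa(series, 2))                 # both segments have the same mean
print(half_segment_means(series, 2))  # the halves differ, exposing the trend
print(trend_bits(series, 2))
```

In this toy series both segments have the same PAA value, so plain PAA cannot distinguish the rising segment from the falling one, while the half-segment means and the bit strings differ between the two segments.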
Acknowledgment
This study is supported by the Shenzhen Research Council (Grant No. JSGG2017-0822160842949, JCYJ20170307151518535).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, C. et al. (2018). An Improvement of PAA on Trend-Based Approximation for Time Series. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11335. Springer, Cham. https://doi.org/10.1007/978-3-030-05054-2_19
DOI: https://doi.org/10.1007/978-3-030-05054-2_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05053-5
Online ISBN: 978-3-030-05054-2
eBook Packages: Computer Science (R0)