Hierarchical clustering of unequal-length time series with area-based shape distance

Xiao Wang (ORCID: orcid.org/0000-0002-0663-6865)¹, Fusheng Yu², Witold Pedrycz³ & Jiayin Wang²

Abstract

Time-series clustering algorithms have been used in a variety of areas to extract valuable information from complex and massive data sets. However, these algorithms suffer from two shortcomings. First, most of them are designed for equal-length time series, whereas unequal-length time series are often encountered in real-world problems. Second, commonly used distance measures for time series cannot fully reveal trend differences. To overcome these two shortcomings, this paper focuses on the trend of time series and employs the area-based shape distance to measure their similarity. In addition, we present a new hierarchical clustering algorithm for unequal-length time series based on the area-based shape distance measure. A series of experiments illustrates the performance of the proposed clustering algorithm.
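
To make the general idea concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering driven by a pairwise distance that accepts series of different lengths. The abstract does not define the area-based shape distance, so `area_distance` below is only an assumed stand-in: it rescales both series to a common time span by linear interpolation, removes their mean level, and approximates the area between the two curves. The function names, the average-linkage rule, and the parameter `n` are all illustrative assumptions and should not be read as the authors' algorithm.

```python
from itertools import combinations


def resample(series, n):
    """Linearly interpolate `series` onto n evenly spaced points (n >= 2)."""
    m = len(series)
    if m == 1:
        return [float(series[0])] * n
    out = []
    for k in range(n):
        pos = k * (m - 1) / (n - 1)   # position on the original index axis
        i = min(int(pos), m - 2)      # left neighbour, clamped at the end
        frac = pos - i
        out.append(series[i] * (1 - frac) + series[i + 1] * frac)
    return out


def area_distance(a, b, n=50):
    """Stand-in distance (NOT the paper's measure): approximate area between
    two mean-centred series after both are rescaled to the time span [0, 1]."""
    ra, rb = resample(a, n), resample(b, n)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    gap = [abs((x - mean_a) - (y - mean_b)) for x, y in zip(ra, rb)]
    step = 1.0 / (n - 1)
    # Trapezoidal rule applied to the absolute gap between the curves.
    return step * (sum(gap) - 0.5 * (gap[0] + gap[-1]))


def agglomerative(series_list, n_clusters, dist=area_distance):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    indices = range(len(series_list))
    # Pairwise distances, computed once and keyed by (smaller, larger) index.
    d = {(i, j): dist(series_list[i], series_list[j])
         for i, j in combinations(indices, 2)}
    clusters = [[i] for i in indices]

    def linkage(c1, c2):
        total = sum(d[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (p < q) and merge q into p.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Two rising and two falling series, all of different lengths.
data = [
    [0, 1, 2, 3, 4],
    [0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5],
    [5, 4, 3, 2, 1, 0],
    [3, 2, 1, 0],
]
result = agglomerative(data, n_clusters=2)
print(result)
```

Because the distance is computed pair by pair, nothing in the clustering loop requires the series to share a length; any other measure with the same call signature could be passed through the `dist` argument.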

Acknowledgements

This work was funded by the National Natural Science Foundation of China (Nos. 11701338, 11571001), the Natural Science Foundation of Shandong Province (No. ZR2016AP12), and a Project of the Shandong Province Higher Educational Science and Technology Program (No. J17KB124).

Author information

Authors and Affiliations

School of Information Engineering, Shandong Youth University of Political Science, Jinan, 250103, China
Xiao Wang
School of Mathematical Sciences, Laboratory of Mathematics and Complex Systems, Ministry of Education, Beijing Normal University, Beijing, 100875, China
Fusheng Yu & Jiayin Wang
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6R 2V4, Canada
Witold Pedrycz

Corresponding author

Correspondence to Fusheng Yu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, X., Yu, F., Pedrycz, W. et al. Hierarchical clustering of unequal-length time series with area-based shape distance. Soft Comput 23, 6331–6343 (2019). https://doi.org/10.1007/s00500-018-3287-6

Published: 09 June 2018
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s00500-018-3287-6
