research-article

Density peaks clustering based on density voting and neighborhood diffusion

Published: 18 October 2024

Abstract

Density Peaks Clustering (DPC) is a well-known clustering technique in data mining that requires few parameters and no iteration. However, on datasets that contain multiple density peaks, choosing cluster centers from the decision graph is subjective and may select the wrong centers. In addition, DPC spends considerable time estimating density and relative distance, and it is sensitive to the value of the cut-off distance. To overcome these issues, a density peaks clustering algorithm based on density voting and neighborhood diffusion (DPC-DVND) is proposed. First, the algorithm uses the k nearest neighbors and a KD-tree to compute local density and relative distance more efficiently. Second, it selects potential cluster centers by density voting and uses the number of votes, rather than density, to assess how feasible each potential center is as a cluster center, so that the centers of low-density clusters can be better distinguished. Finally, two neighborhood density diffusion rules are designed to propagate labels and form the core structure of the clusters. Experiments on synthetic, real, and image datasets compare DPC-DVND with other methods. The results show that DPC-DVND outperforms other state-of-the-art algorithms in both effectiveness and efficiency.
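
The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea of kNN-based density and density voting, not the authors' DPC-DVND method. It assumes that density is the inverse of the mean distance to the k nearest neighbors and that each point casts one vote for the densest point among itself and its k neighbors; the function names `knn`, `knn_density`, and `density_votes` are hypothetical. The neighbor search is brute force to keep the example dependency-free; a KD-tree (for example `scipy.spatial.cKDTree`) would replace it in practice.

```python
import math

def knn(points, k):
    """Brute-force k nearest neighbors: per point, a list of (distance, index)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

def knn_density(neighbors):
    """Assumed density: inverse of the mean distance to the k nearest neighbors."""
    return [len(nb) / (sum(d for d, _ in nb) + 1e-12) for nb in neighbors]

def density_votes(neighbors, density):
    """Assumed voting rule: each point votes for the densest point among
    itself and its k nearest neighbors."""
    votes = [0] * len(density)
    for i, nb in enumerate(neighbors):
        candidates = [i] + [j for _, j in nb]
        winner = max(candidates, key=lambda j: density[j])
        votes[winner] += 1
    return votes

# Toy data: a tight cluster around (0, 0) and a sparse cluster around (10, 10).
points = [
    (0.0, 0.0), (0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1),
    (10.0, 10.0), (11.0, 10.0), (9.0, 10.0), (10.0, 11.0), (10.0, 9.0),
]
neighbors = knn(points, k=3)
density = knn_density(neighbors)
votes = density_votes(neighbors, density)
print(votes)
```

On this toy data the center of the sparse cluster has roughly a tenth of the density of the tight cluster's center, yet both centers collect the same number of votes. This illustrates why ranking candidates by votes rather than by raw density can make the centers of low-density clusters easier to pick out.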

Published In

Information Sciences: an International Journal, Volume 681, Issue C
Oct 2024
1022 pages

Publisher

Elsevier Science Inc., United States

Author Tags

1. Clustering
2. Density peaks
3. Density voting
4. Neighborhood diffusion
