research-article

Density peaks clustering based on density voting and neighborhood diffusion

Authors:

Hui Li

Volume 681, Issue C

https://doi.org/10.1016/j.ins.2024.121209

Published: 01 October 2024

Abstract

Density Peaks Clustering (DPC) is a well-known clustering technique in data mining that needs few parameters and no iteration. However, on datasets containing multiple peaks, DPC relies on a subjective reading of the decision graph and may therefore select the wrong cluster centers. In addition, DPC spends a considerable amount of time estimating density and relative distance, and it is sensitive to the value of the cut-off distance. To overcome these issues, a density peaks clustering algorithm based on density voting and neighborhood diffusion (DPC-DVND) is proposed. First, the proposed algorithm uses the k nearest neighbors and a KD-tree to compute local density and relative distance more efficiently. Second, it selects potential cluster centers by density voting and uses the number of votes, instead of density, to calculate the feasibility of each potential center becoming a cluster center, so that the centers of low-density clusters can be better distinguished. Finally, two neighborhood density diffusion rules are designed to propagate labels and form the core structure of clusters. Experiments on synthetic, real, and image datasets compare DPC-DVND with other methods. The results show that DPC-DVND outperforms other state-of-the-art algorithms in both effectiveness and efficiency.
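The two quantities the abstract refers to, local density and relative distance, can be illustrated with a minimal sketch. This is not the DPC-DVND implementation: the abstract does not give the paper's density formula, its voting rule, or its diffusion rules. The sketch assumes a common kNN density estimate (the inverse of the mean distance to the k nearest neighbors), uses a brute-force neighbor search where the paper uses a KD-tree, and ranks candidate centers by the product of density and relative distance as in the original DPC decision graph. The function names `knn_density`, `relative_distance`, and `pick_centers` are hypothetical.

```python
import math


def knn_density(points, k):
    """Local density per point (assumed estimate, not the paper's formula):
    inverse of the mean distance to the k nearest neighbours.
    Brute-force search is used here in place of a KD-tree."""
    rho = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn = sum(dists[:k]) / k
        rho.append(1.0 / (mean_knn + 1e-12))  # small constant avoids division by zero
    return rho


def relative_distance(points, rho):
    """Relative distance per point, as in the original DPC: the distance to
    the nearest point with strictly higher density. A point with no denser
    point gets its largest distance to any other point."""
    delta = []
    for i, p in enumerate(points):
        denser = [math.dist(p, q) for j, q in enumerate(points) if rho[j] > rho[i]]
        if denser:
            delta.append(min(denser))
        else:
            delta.append(max(math.dist(p, q) for q in points))
    return delta


def pick_centers(rho, delta, n_centers):
    """Rank points by rho * delta (the usual decision-graph score) and return
    the indices of the top n_centers candidates."""
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    return order[:n_centers]
```

On two well-separated groups of points, the highest-scoring candidates are the densest point of each group. According to the abstract, DPC-DVND replaces the density term in this score with a vote count so that centers of low-density clusters are easier to distinguish; that step is not reproduced here.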

Index Terms

Density peaks clustering based on density voting and neighborhood diffusion
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Cluster analysis
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Clustering and classification
  2. Information systems applications
    1. Data mining
      1. Clustering

Index terms have been assigned to the content through auto-classification.

Published In

Information Sciences: an International Journal Volume 681, Issue C

Oct 2024

1022 pages

Elsevier Inc.

Publisher

Elsevier Science Inc.

United States
