CSIM: A Fast Community Detection Algorithm Based on Structure Information Maximization
<p>Example of encoding graph <span class="html-italic">G</span> when the community structure is unknown.</p> "> Figure 2
<p>Example of an encoding graph <span class="html-italic">G</span> when the community structure is known.</p> "> Figure 3
<p>The NMI values between the partitions detected via the four algorithms and the ground truth partitions.</p> "> Figure 4
<p>Comparison of modularity and community structure information in detecting community structure. (<b>a</b>) Community structure detected by optimizing community structure information. (<b>b</b>) Community structure detected by optimizing modularity.</p> ">
Abstract
:1. Introduction
- We introduce a novel approach to defining structural entropy by focusing on the encoding of edge information, named community structure information. This approach calculates the difference in the number of bytes required to encode an edge under unknown and known community structures, capturing the amount of information leaked via the community structure.
- We propose an algorithm, CSIM, for the approximate calculation of the maximum community structure information, which can be employed for community detection. Notably, for a social network with n nodes, the time complexity is the same as that of the Louvain algorithm, with both having an average time complexity of .
- We conducted experiments on real-world network data, and the results demonstrate that the computational output of our proposed algorithm closely approximates the maximum value of community structure information. Furthermore, the community structure obtained through this algorithm aligns more closely with the ground truth community structure.
2. Related Work
2.1. Metrics for Structural Information
2.2. Community Detection
2.3. Structure Entropy
3. Community Structure Information
3.1. The Problem of Modularity
- The contribution term for community in modularity is a linear function of plus a quadratic function of . This implies that the contribution of adding a new edge within community to the value linearly diminishes with the scale of . However, intuitively, this decay should be superlinear. For example, in two communities with the same number of nodes, where one is densely connected internally and the other is sparsely connected, the contribution of a new edge to the sparse community should be significantly greater than to the dense community.
- On the other hand, considering the addition of a new edge between and other communities, although the term implies a penalty for the new edge, this penalty linearly increases with the scale of the community. This is counterintuitive because the penalty for small communities should be high, while for large communities, it should be low. This makes it easier for the optimization of to lead to the merging of small communities into larger ones.
3.2. Community Structure Information
- (i)
- u and v belong to the same community . In this case, we first identify community with a probability of , and then we select u with a probability of and v with a probability of ;
- (ii)
- u and v belong to different communities, and , respectively. In this case, we first identify and with probabilities and , respectively. Then, we independently select u from with a probability of and v from with a probability of .
4. Community Detection Algorithm, CSIM
4.1. Preliminaries
4.2. CSIM
Algorithm 1 Community Structure Information Maximization Algorithm: CSIM |
Input: , ; |
Output: Community structure C and ; |
1: do |
2: Set each node as a community, namely ; |
3: ; |
4: for do |
5: ; |
6: ; |
7: for do |
8: if then |
9: Calculate the value of ; |
10: if then |
11: , ; |
12: end if |
13: end if |
14: put node into of C; |
15: end for |
16: end for |
17: if then |
18: Aggregate communities into super-nodes, and keep track of the members of each super-node; |
19: end if |
20: while |
21: Extract the super nodes in ; |
22: Return: C and |
5. Experiment
5.1. Experimental Settings
5.2. Experimental Analysis
6. Discussion
6.1. Analysis of Contribution
6.2. Implications
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Zhang, T.; Xu, C.; Lian, Y.; Tian, H.; Kang, J.; Kuang, X.; Niyato, D. When Moving Target Defense Meets Attack Prediction in Digital Twins: A Convolutional and Hierarchical Reinforcement Learning Approach. IEEE J. Sel. Areas Commun. 2023, 41, 3293–3305. [Google Scholar] [CrossRef]
- Fortunato, S. Community detection in graphs. Phys. Rep. 2010, 486, 75–174. [Google Scholar] [CrossRef]
- Zhang, T.; Xu, C.; Shen, J.; Kuang, X.; Grieco, L.A. How to Disturb Network Reconnaissance: A Moving Target Defense Approach Based on Deep Reinforcement Learning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 5735–5748. [Google Scholar] [CrossRef]
- Traud, A.L.; Mucha, P.J.; Porter, M.A. Social Structure of Facebook Networks. Phys. A Stat. Mech. Its Appl. 2012, 391, 4165–4180. [Google Scholar] [CrossRef]
- Ravasz, E.; Somera, A.L.; Mongru, D.A.; Oltvai, Z.N.; Barabasi, A. Hierarchical Organization of Modularity in Metabolic Networks. Science 2002, 297, 1551–1555. [Google Scholar] [CrossRef]
- Zhang, T.; Xu, C.; Zou, P.; Tian, H.; Kuang, X.; Yang, S.; Zhong, L.; Niyato, D. How to Mitigate DDoS Intelligently in SD-IoV: A Moving Target Defense Approach. IEEE Trans. Ind. Inform. 2023, 19, 1097–1106. [Google Scholar] [CrossRef]
- Newman, M.E.J.; Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 2004, 69, 026113. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Zhang, S.; Wang, R.S.; Zhang, X.S.; Chen, L. Quantitative function for community detection. Phys. Rev. E 2008, 77, 036109. [Google Scholar] [CrossRef] [PubMed]
- Aldecoa, R.; Marín, I. Deciphering network community structure by surprise. PLoS ONE 2011, 6, e24195. [Google Scholar] [CrossRef] [PubMed]
- Chakraborty, T.; Srinivasan, S.; Ganguly, N.; Mukherjee, A.; Bhowmick, S. On the permanence of vertices in network communities. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 1396–1405. [Google Scholar]
- Girvan, M.; Newman, M.E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 2002, 99, 7821–7826. [Google Scholar] [CrossRef] [PubMed]
- Blondel, V.D.; Guillaume, J.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, 10008. [Google Scholar] [CrossRef]
- Guimera, R.; Amaral, L.A.N. Functional cartography of complex metabolic networks. Nature 2005, 433, 895–900. [Google Scholar] [CrossRef]
- Duch, J.; Arenas, A. Community detection in complex networks using extremal optimization. Phys. Rev. E 2005, 72, 027104. [Google Scholar] [CrossRef]
- Newman, M.E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 2006, 74, 036104. [Google Scholar] [CrossRef] [PubMed]
- Agarwal, G.; Kempe, D. Modularity-maximizing graph communities via mathematical programming. Eur. Phys. J. B 2008, 66, 409–418. [Google Scholar] [CrossRef]
- Lancichinetti, A.; Fortunato, S. Community detection algorithms: A comparative analysis. Phys. Rev. E 2009, 80, 056117. [Google Scholar] [CrossRef] [PubMed]
- Fortunato, S.; Barthelemy, M. Resolution limit in community detection. Proc. Natl. Acad. Sci. USA 2007, 104, 36–41. [Google Scholar] [CrossRef] [PubMed]
- Yang, L.; Cao, X.; He, D.; Wang, C.; Wang, X.; Zhang, W. Modularity Based Community Detection with Deep Learning. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), New York, NY, USA, 9–15 July 2016; Volume 16, pp. 2252–2258. [Google Scholar]
- Li, A.; Li, J.; Pan, Y. Discovering natural communities in networks. Phys. A Stat. Mech. Its Appl. 2015, 436, 878–896. [Google Scholar] [CrossRef]
- Rashevsky, N. Life, Information Theory, and Topology. Bull. Math. Biophys. 1955, 17, 229–235. [Google Scholar] [CrossRef]
- Braunstein, S.L.; Ghosh, S.; Severini, S. The laplacian of a graph as a density matrix: A basic combinatorial approach to separability of mixed states. Ann. Comb. 2006, 10, 291–317. [Google Scholar] [CrossRef]
- Dehmer, M. Information processing in complex networks: Graph entropy and information functionals. Appl. Math. Comput. 2008, 201, 82–94. [Google Scholar] [CrossRef]
- Anand, K.; Bianconi, G. Entropy measures for networks: Toward an information theory of complex topologies. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2009, 80, 045102. [Google Scholar] [CrossRef] [PubMed]
- Li, A.; Pan, Y. Structural Information and Dynamical Complexity of Networks. IEEE Trans. Inf. Theory 2016, 62, 3290–3339. [Google Scholar] [CrossRef]
- Newman, M.E. Detecting community structure in networks. Eur. Phys. J. B 2004, 38, 321–330. [Google Scholar] [CrossRef]
- Rosvall, M.; Bergstrom, C.T. Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. USA 2008, 105, 1118–1123. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Q.; Li, M.; Deng, Y. A new structure entropy of complex networks based on nonextensive statistical mechanics. Int. J. Mod. Phys. C 2016, 27, 1650118. [Google Scholar] [CrossRef]
- Liu, Y.; Liu, J.; Wan, K.; Qin, Z.; Zhang, Z.; Khoussainov, B.; Zhu, L. From local to global norm emergence: Dissolving self-reinforcing substructures with incremental social instruments. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 6871–6881. [Google Scholar]
- Zhang, Q.; Li, M. A betweenness structural entropy of complex networks. Chaos Solitons Fractals 2022, 161, 112264. [Google Scholar] [CrossRef]
- Cai, M.; Liu, J.; Cui, Y. A Network Structure Entropy Considering Series-Parallel Structures. Entropy 2022, 24, 852. [Google Scholar] [CrossRef] [PubMed]
- Weiss, R.S.; Jacobson, E. A method for the analysis of the structure of complex organizations. Am. Sociol. Rev. 1955, 20, 661–668. [Google Scholar] [CrossRef]
- McCorry, P.; Möser, M.; Shahandasti, S.F.; Hao, F. Towards bitcoin payment networks. In Proceedings of the Australasian Conference on Information Security and Privacy, Melbourne, Australia, 4–6 July 2016; Springer: Cham, Switzerland, 2016; pp. 57–76. [Google Scholar]
- Vidal, M.; Cusick, M.E.; Barabási, A.L. Interactome networks and human disease. Cell 2011, 144, 986–998. [Google Scholar] [CrossRef]
- Barabási, A.L.; Gulbahce, N.; Loscalzo, J. Network medicine: A network-based approach to human disease. Nat. Rev. Genet. 2011, 12, 56–68. [Google Scholar] [CrossRef]
- Gregory, S. Local Betweenness for Finding Communities in Networks; University of Bristol: Bristol, UK, 2008. [Google Scholar]
- Shi, J.; Malik, J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905. [Google Scholar]
- Von Luxburg, U. A tutorial on spectral clustering. Stat. Comput. 2007, 17, 395–416. [Google Scholar] [CrossRef]
- Brandes, U.; Delling, D.; Gaertler, M.; Gorke, R.; Hoefer, M.; Nikoloski, Z.; Wagner, D. On modularity clustering. IEEE Trans. Knowl. Data Eng. 2007, 20, 172–188. [Google Scholar] [CrossRef]
- Chen, M.; Kuzmin, K.; Szymanski, B.K. Community detection via maximization of modularity and its variants. IEEE Trans. Comput. Soc. Syst. 2014, 1, 46–65. [Google Scholar] [CrossRef]
- Zhuang, D.; Chang, J.M.; Li, M. DynaMo: Dynamic community detection by incrementally maximizing modularity. IEEE Trans. Knowl. Data Eng. 2019, 33, 1934–1945. [Google Scholar] [CrossRef]
- Fortunato, S.; Hric, D. Community detection in networks: A user guide. Phys. Rep. 2016, 659, 1–44. [Google Scholar] [CrossRef]
- Jin, D.; Yu, Z.; Jiao, P.; Pan, S.; He, D.; Wu, J.; Philip, S.Y.; Zhang, W. A survey of community detection approaches: From statistical modeling to deep learning. IEEE Trans. Knowl. Data Eng. 2021, 35, 1149–1170. [Google Scholar] [CrossRef]
- Ruggeri, N.; Contisciani, M.; Battiston, F.; De Bacco, C. Community detection in large hypergraphs. Sci. Adv. 2023, 9, eadg9159. [Google Scholar] [CrossRef]
- Bronstein, M.M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric deep learning: Going beyond euclidean data. IEEE Signal Process. Mag. 2017, 34, 18–42. [Google Scholar] [CrossRef]
- Qu, L.; Zhu, H.; Duan, Q.; Shi, Y. Continuous-time link prediction via temporal dependent graph neural network. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 3026–3032. [Google Scholar]
- Dai, H.; Kozareva, Z.; Dai, B.; Smola, A.; Song, L. Learning steady-states of iterative algorithms over graphs. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 1106–1114. [Google Scholar]
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907. [Google Scholar]
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903. [Google Scholar]
- Yu, W.; Zheng, C.; Cheng, W.; Aggarwal, C.C.; Song, D.; Zong, B.; Chen, H.; Wang, W. Learning deep network representations with adversarially regularized autoencoders. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 2663–2671. [Google Scholar]
- Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 922–929. [Google Scholar]
- Shannon, C. The lattice theory of information. Trans. IRE Prof. Group Inf. Theory 1953, 1, 105–107. [Google Scholar] [CrossRef]
- Li, A.; Zhang, X.; Pan, Y. Resistance maximization principle for defending networks against virus attack. Phys. A Stat. Mech. Appl. 2017, 466, 211–223. [Google Scholar] [CrossRef]
- Li, A.; Yin, X.; Xu, B.; Wang, D.; Han, J.; Wei, Y.; Deng, Y.; Xiong, Y.; Zhang, Z. Decoding topologically associating domains with ultra-low resolution Hi-C data by graph structural entropy. Nat. Commun. 2018, 9, 3265. [Google Scholar] [CrossRef] [PubMed]
- Liu, Y.; Liu, J.; Zhang, Z.; Zhu, L.; Li, A. REM: From structural entropy to community structure deception. Adv. Neural Inf. Process. Syst. 2019, 32, 12938–12948. [Google Scholar]
- Hirai, S.; Yamanishi, K. Detecting latent structure uncertainty with structural entropy. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 26–35. [Google Scholar]
- Wang, H.; Liu, Y.; Yin, P.; Zhang, H.; Xu, X.; Wen, Q. Label specificity attack: Change your label as I want. Int. J. Intell. Syst. 2022, 37, 7767–7786. [Google Scholar] [CrossRef]
- Tian, Y.; Zhang, Z.; Xiong, J.; Chen, L.; Ma, J.; Peng, C. Achieving graph clustering privacy preservation based on structure entropy in social IoT. IEEE Internet Things J. 2021, 9, 2761–2777. [Google Scholar] [CrossRef]
- Liu, W.; Liu, J.; Zhang, Z.; Liu, Y.; Zhu, L. Residual Entropy-based Graph Generative Algorithms. In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Auckland, New Zealand, 9–13 May 2022; pp. 816–824. [Google Scholar]
- Wan, K.; Liu, J.; Liu, Y.; Zhang, Z.; Khoussainov, B. Attacking community detectors: Mislead detectors via manipulating the graph structure. In Proceedings of the International Conference on Mobile Computing, Applications, and Services, Virtual, 23–24 November 2021; Springer: Cham, Switzerland, 2021; pp. 112–128. [Google Scholar]
- Zhang, S.; Liu, J.; Liu, Y.; Zhang, Z.; Khoussainov, B. Improving togetherness using structural entropy. In Proceedings of the International Conference on Mobile Computing, Applications, and Services, Virtual, 23–24 November 2021; Springer: Cham, Switzerland, 2021; pp. 85–98. [Google Scholar]
- Reichardt, J.; Bornholdt, S. Statistical mechanics of community detection. Phys. Rev. E 2006, 74, 016110. [Google Scholar] [CrossRef]
- Clauset, A.; Newman, M.E.J.; Moore, C. Finding community structure in very large networks. Phys. Rev. E 2004, 70, 066111. [Google Scholar] [CrossRef]
- Newman, M.E.J. Modularity and community structure in networks. Proc. Natl. Acad. Sci. USA 2006, 103, 8577–8582. [Google Scholar] [CrossRef] [PubMed]
- Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
- Vinh, N.X.; Epps, J.; Bailey, J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 2010, 11, 2837–2854. [Google Scholar]
Data G | Type | Description | ||
---|---|---|---|---|
Yeast | 688 | 1079 | Microorganism | Transcriptional regulatory network in brewing yeast |
E. coli | 423 | 519 | Microorganism | Transcriptional regulatory network in Escherichia coli |
Elect. circuit | 512 | 819 | Electronic | Electronic circuit network of electronic components |
Social | 67 | 182 | Social network | Social network of positive emotions among individuals |
C. elegans | 306 | 2345 | Animal | Neural network of Caenorhabditis elegans |
Data G | Type | Description | ||
---|---|---|---|---|
Karate | 34 | 78 | Social network | Social network among members of karate clubs |
Dolphin | 62 | 159 | Animal | Social network of associations among dolphins |
Jazz | 198 | 2742 | Social network | A collaboration network among jazz musicians |
1133 | 5451 | Communication | Email communication network among members of a university in Spain | |
2888 | 2981 | Online social | Friendship network among selected users on Facebook | |
Power grid | 4941 | 6594 | Infrastructure | Topological network of the power grid in the western United States |
PGP | 10,680 | 24,316 | Online social | Interacting network among PGP users |
Data G | RC(G) via CSIM | RC(G) via R_greedy |
---|---|---|
Yeast | 3.849 | 3.847 |
E. coli | 4.005 | 4.032 |
Elect. circuit | 4.117 | 4.097 |
Social | 2.494 | 2.488 |
C. elegans | 1.565 | 1.548 |
Karate | 1.352 | 1.298 |
Dolphin | 1.750 | 1.743 |
Jazz | 1.434 | 1.308 |
2.676 | 2.475 | |
2.723 | 2.723 | |
Power grid | 7.019 | 6.996 |
PGP | 6.647 | 6.476 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, Y.; Liu, W.; Tang, X.; Yin, H.; Yin, P.; Xu, X.; Wang, Y. CSIM: A Fast Community Detection Algorithm Based on Structure Information Maximization. Electronics 2024, 13, 1119. https://doi.org/10.3390/electronics13061119
Liu Y, Liu W, Tang X, Yin H, Yin P, Xu X, Wang Y. CSIM: A Fast Community Detection Algorithm Based on Structure Information Maximization. Electronics. 2024; 13(6):1119. https://doi.org/10.3390/electronics13061119
Chicago/Turabian StyleLiu, Yiwei, Wencong Liu, Xiangyun Tang, Hao Yin, Peng Yin, Xin Xu, and Yanbin Wang. 2024. "CSIM: A Fast Community Detection Algorithm Based on Structure Information Maximization" Electronics 13, no. 6: 1119. https://doi.org/10.3390/electronics13061119
APA StyleLiu, Y., Liu, W., Tang, X., Yin, H., Yin, P., Xu, X., & Wang, Y. (2024). CSIM: A Fast Community Detection Algorithm Based on Structure Information Maximization. Electronics, 13(6), 1119. https://doi.org/10.3390/electronics13061119