Abstract
Social network analysis has become an important topic for researchers in sociology and computer science. Similarities among individuals form communities as the basic constitutions of social networks. Regarding the importance of communities, community detection is a fundamental step in the study of social networks typically modeled as large-scale graphs. Detecting communities in such large-scale graphs which generally suffers from the curse of dimensionality is the main objective followed in this study. An efficient modularity-based community detection algorithm called MDPCluster is introduced in order to detect communities in large-scale graphs in a timely manner. To address the high dimensionality problem, first, a Louvain-based algorithm is utilized by MDPCluster to distinguish initial communities as super-nodes and then a Modified Discrete Particle Swarm Optimization algorithm, called MDPSO is leveraged to detect communities through maximizing modularity measure. MDPSO discretizes Particle Swarm Optimization using the idea of transmission tendency and also escapes from premature convergence thereby a mutation operator inspired by Genetic Algorithm. To evaluate the proposed method, six standard datasets, i.e., American College Football, Books about US Politics, Amazon Product Co-purchasing, DBLP, GR-QC and HEP-TH have been employed. The first two are known as synthetic datasets whereas the rest are real-world datasets. In comparison to eight state-of-the-art algorithms, i.e., Stationary Genetic Algorithm, Generational Genetic Algorithm, Simulated Annealing-Stationary Genetic Algorithm, Simulated Annealing-Generational Genetic Algorithm, Grivan–Newman, Danon and Label Propagation Algorithm, the results indicate the superiorities of MDCluster in terms of modularity, Normalized Mutual Information and execution time as well.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Modularity and an Improved Genetic Algorithm.
References
González-Bailón S (2013) Social science in the era of big data. Policy Int 5(2):147–160
Javadi, SHS, Gharani P, Khadivi S (2018) Detecting community structure in dynamic social networks using the concept of leadership. In: Sustainable interdependent networks. Springer, pp 97–118
Jiang JQ, McQuay LJ (2012) Modularity functions maximization with nonnegative relaxation facilitates community detection in networks. Physica A 391(3):854–865
Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213
Kadirkamanathan V, Selvarajah K, Fleming PJ (2006) Stability analysis of the particle dynamics in particle swarm optimizer. IEEE Trans Evol Comput 10(3):245–255
Du K-L, Swamy M (2016) Particle swarm optimization. In: Search and optimization by metaheuristics. Springer, pp 153–173
Poli R, Kennedy J, Blackwell T (2007) Particle swarm optimization. Swarm Intell 1(1):33–57
Yuan X, Wang L, Yuan Y (2008) Application of enhanced PSO approach to optimal scheduling of hydro system. Energy Convers Manag 49(11):2966–2972
Liu B, Wang L, Jin Y-H (2007) An effective PSO-based memetic algorithm for flow shop scheduling. IEEE Trans Syst Man Cybern Part B Cybern 37(1):18–27
Liu B, Wang L, Jin Y, Huang D (2005) Advances in particle swarmoptimization algorithm. Control Instrum Chem Ind 32(3):1–6
Aung TT, Nyunt TTS (2019) Discrete artificial bee colony algorithm for community detection in social networks. In: Seventeenth international conference on computer applications (ICCA 2019)
Ghoshal AK, Das N, Bhattacharjee S, Chakraborty G (2019) A fast parallel genetic algorithm based approach for community detection in large networks. In: 2019 11th international conference on communication systems and networks (COMSNETS). IEEE, pp 95–101
Guerrero M, Montoya FG, Baños R, Alcayde A, Gil C (2017) Adaptive community detection in complex networks using genetic algorithms. Neurocomputing 266:101–113
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Traag VA, Waltman L, van Eck NJ (2019) From Louvain to Leiden: guaranteeing well-connected communities. Scientific Rep 9:e5233
Guimera R, Danon L, Diaz-Guilera A, Giralt F, Arenas A (2003) Self-similar community structure in a network of human interactions. Phys Rev E 68(6):065103
Que X, Checconi F, Petrini F, Gunnels JA (2015) Scalable community detection with the louvain algorithm. In: 2015 IEEE international on parallel and distributed processing symposium (IPDPS). IEEE, pp 28–37
Newman ME (2012) Communities, modules and large-scale structure in networks. Nat Phys 8(1):25
Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1(1):27–64
Rossi F, Villa-Vialaneix N (2010) Optimizing an organized modularity measure for topographic graph clustering: a deterministic annealing approach. Neurocomputing 73(7–9):1142–1163
Wang R-S, Zhang S, Wang Y, Zhang X-S, Chen L (2008) Clustering complex networks and biological networks by nonnegative matrix factorization with various similarity measures. Neurocomputing 72(1–3):134–141
Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008
Newman ME (2001) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64(1):016132
Shang R, Bai J, Jiao L, Jin C (2013) Community detection based on modularity and an improved genetic algorithm. Physica A 392(5):1215–1231
Moradi M, Parsa S (2019) An evolutionary method for community detection using a novel local search strategy. Physica A 523:457–475
Ying W et al (2019) Parallel conical area community detection using evolutionary multi-objective optimization. Processes 7(2):111
Ji J, Song X, Liu C, Zhang X (2013) Ant colony clustering with fitness perception and pheromone diffusion for community detection in complex networks. Physica A 392(15):3260–3272
Cai Q, Gong M, Shen B, Ma L, Jiao L (2014) Discrete particle swarm optimization for identifying community structures in signed social networks. Neural Netw 58:4–13
Hassan EA, Hafez AI, Hassanien AE, Fahmy AA (2014) Community detection algorithm based on artificial fish swarm optimization. In: Intelligent Systems’ 2014. Springer, pp 509–521
Wang Y, Xu W, Kang Q (2018) AFSMA: an enhanced artificial fish swarm algorithm based on mutuality for community detection. In: Proceedings of the 2nd international conference on big data research. ACM, pp 107–113
Li X, Wu X, Xu S, Qing S, Chang P-C (2019) A novel complex network community detection approach using discrete particle swarm optimization with particle diversity and mutation. Appl Soft Comput 81:105476
Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: Proceedings of the sixth international symposium on micro machine and human science, 1995. MHS’95. IEEE, pp 39–43
Kennedy R, Eberhart J (1995) Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks IV, vol 1000, p 1995
Yang X-S (2014) Swarm intelligence based algorithms: a critical analysis. Evol Intell 7(1):17–28
Yang X-S, He X (2016) Nature-inspired optimization algorithms in engineering: overview and applications. In: Nature-inspired computation in engineering. Springer, pp 1–20
Yu YF, Li G, Xu C (2013) An improved particle swarm optimization algorithm. In: Applied mechanics and materials. Trans Tech Publ, vol 401, pp 1328–1335
Selvi V, Umarani DR (2010) Comparative analysis of ant colony and particle swarm optimization techniques. Int J Comput Appl 5(4):1–6
Shi Y, Eberhart R (1998) A modified particle swarm optimizer. In: The 1998 IEEE international conference on evolutionary computation proceedings, 1998. IEEE World Congress on Computational Intelligence. IEEE, pp 69–73
Clerc M (2004) Discrete particle swarm optimization, illustrated by the traveling salesman problem. In: New optimization techniques in engineering. Springer, pp 219–239
Pan Q-K, Tasgetiren MF, Liang Y-C (2008) A discrete particle swarm optimization algorithm for the no-wait flowshop scheduling problem. Comput Oper Res 35(9):2807–2839
Strasser S, Goodman R, Sheppard J, Butcher S (2016) A new discrete particle swarm optimization algorithm. In: Proceedings of the genetic and evolutionary computation conference 2016. ACM, pp 53–60
Wu H, Nie C, Kuo F-C, Leung H, Colbourn CJ (2015) A discrete particle swarm optimization for covering array generation. IEEE Trans Evol Comput 19(4):575–591
Chatterjee S, Sarkar S, Hore S, Dey N, Ashour AS, Balas VE (2017) Particle swarm optimization trained neural network for structural failure prediction of multistoried RC buildings. Neural Comput Appl 28(8):2005–2016
Momeni E, Armaghani DJ, Hajihassani M, Amin MFM (2015) Prediction of uniaxial compressive strength of rock samples using hybrid particle swarm optimization-based artificial neural networks. Measurement 60:50–63
Lynn N, Suganthan PN (2015) Distance based locally informed particle swarm optimizer with dynamic population size. In: Proceedings of the 18th Asia Pacific symposium on intelligent and evolutionary systems-volume 2. Springer, pp 577–587
Meunier D, Lambiotte R, Fornito A, Ersche K, Bullmore ET (2009) Hierarchical modularity in human brain functional networks. Front Neuroinform 3:37
Pujol JM, Erramilli V, Rodriguez P (2009) Divide and conquer: partitioning online social networks. arXiv preprint arXiv:0905.4918
Raeder T, Chawla NV (2011) Market basket analysis with networks. Social Netw Anal Min 1(2):97–113
Newman ME (2006) Modularity and community structure in networks. Proc Natl Acad Sci 103(23):8577–8582
Leskovec J, Kleinberg J, Faloutsos C (2007) Graph evolution: Densification and shrinking diameters. ACM Trans Knowl Discovery Data (TKDD) 1(1):2
Boseniuk T, Ebeling W (1990) Boltzmann-, Darwin-and Haeckel-strategies in optimization problems. In International conference on parallel problem solving from nature. Springer, pp 429–444
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680
Cai X, Shi Y, Zhu Y, Qiao Y, Hu F (2017) An algorithm Q-PSO for community detection in complex networks. In 2017 16th international symposium on distributed computing and applications to business, engineering and science (DCABES). IEEE, pp 76–79
Bilal S, Abdelouahab M (2017) Evolutionary algorithm and modularity for detecting communities in networks. Physica A 473:89–96
Gong M, Fu B, Jiao L, Du H (2011) Memetic algorithm for community detection in networks. Phys Rev E 84(5):056101
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
Figure 13 shows the pseudocode of PSO algorithm. In this algorithm, the main loop iterates as many times as total iterations or as it is known as T. In this loop, there is an inner loop which runs as many times as the number of individuals in the population, Np, in which velocities and positions are updated. The fitness function is calculated for all the individuals of the population as many times as the number of problem dimensions or |V|. The time complexity of calculating fitness function is \( O\left( {\left| V \right|^{2} } \right) \) in this paper. Therefore, the time complexity of PSO is \( O\left( {\left( {\left| V \right|*N_{p} + \left| V \right|^{2} *N_{p} } \right)*T} \right) \), and with some simplification, the time complexity of the algorithm is \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T} \right) \).
In GA, the main loop iterates as many times as all the iterations T. in this loop, there is an inner loop which runs as many times as the number of individuals in population Np in which the fitness function for all individual is calculated and mutation and cross-over are applied on population by mutation and cross-over rates. The time complexity of fitness function in this problem (modularity) is \( O\left( {\left| V \right|^{2} } \right) \). Thus, time complexity of GA would be \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T*(p_{c} *O\left( {cross over} \right) + p_{m} *O\left( {mutation} \right))} \right) \). With some simplifications we have \( O\left( {\left( {\left| V \right|^{2} *N_{p} } \right)*T} \right) \) for the time complexity of the algorithm.
See Fig. 17.
Rights and permissions
About this article
Cite this article
Fozuni Shirjini, M., Farzi, S. & Nikanjam, A. MDPCluster: a swarm-based community detection algorithm in large-scale graphs. Computing 102, 893–922 (2020). https://doi.org/10.1007/s00607-019-00787-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00607-019-00787-4