[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/PACT58117.2023.00022acmconferencesArticle/Chapter ViewAbstractPublication PagespactConference Proceedingsconference-collections
research-article

Parallelizing Maximal Clique Enumeration on GPUs

Published: 26 November 2024 Publication History

Abstract

We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm. Prior works on parallelizing MCE on GPUs perform a breadth-first traversal of the tree, which has limited scalability because of the explosion in the number of tree nodes at deep levels. We propose to parallelize MCE on GPUs by performing depth-first traversal of independent subtrees in parallel. Since MCE suffers from high load imbalance and memory capacity requirements, we propose a worker list for dynamic load balancing, as well as partial induced subgraphs and a compact representation of excluded vertex sets to regulate memory consumption. Our evaluation shows that our GPU implementation on a single GPU outperforms the state-of-the-art parallel CPU implementation by a geometric mean of 4.9 × (up to 16.7 ×), and scales efficiently to multiple GPUs. Our code has been open-sourced to enable further research on accelerating MCE.

References

[1]
E. Gregori, L. Lenzini, and S. Mainardi, "Parallel k-clique community detection on large-scale networks", IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 8, pp. 1651--1660, 2012.
[2]
I. Derényi, G. Palla, and T. Vicsek, "Clique percolation in random networks", Physical review letters, vol. 94, no. 16, p. 160202, 2005.
[3]
W. Gao, K.-F. Wong, Y. Xia, and R. Xu, "Clique percolation method for finding naturally cohesive and overlapping document clusters", in International Conference on Computer Processing of Oriental Languages. Springer, 2006, pp. 97--108.
[4]
Z. Lu, J. Wahlström, and A. Nehorai, "Community detection in complex networks via clique conductance", Scientific reports, vol. 8, no. 1, pp. 1--16, 2018.
[5]
P. Vilakone, D.-S. Park, K. Xinchang, and F. Hao, "An efficient movie recommendation algorithm based on improved k-clique", Human-centric Computing and Information Sciences, vol. 8, no. 1, pp. 1--15, 2018.
[6]
S. Manoharan, "Patient diet recommendation system using k clique and deep learning classifiers", Journal of Artificial Intelligence, vol. 2, no. 02, pp. 121--130, 2020.
[7]
F. Glaria, C. Hernández, S. Ladra, G. Navarro, and L. Salinas, "Compact structure for sparse undirected graphs based on a clique graph partition", Information Sciences, vol. 544, pp. 485--499, 2021.
[8]
R. A. Rossi and R. Zhou, "Graphzip: a clique-based sparse graph compression method", Journal of Big Data, vol. 5, no. 1, pp. 1--14, 2018.
[9]
R. A. Rossi and R. Zhou, "System and method for compressing graphs via cliques", Feb. 26 2019, uS Patent 10,217,241.
[10]
L. Lai, L. Qin, X. Lin, Y. Zhang, L. Chang, and S. Yang, "Scalable distributed subgraph enumeration", Proceedings of the VLDB Endowment, vol. 10, no. 3, pp. 217--228, 2016.
[11]
H. N. Chua, K. Ning, W.-K. Sung, H. W. Leong, and L. Wong, "Using indirect protein-protein interactions for protein complex prediction", in Computational Systems Bioinformatics: (Volume 6). World Scientific, 2007, pp. 97--109.
[12]
M. Pellegrini, M. Baglioni, and F. Geraci, "Protein complex prediction for large protein protein interaction networks with the core&peel method", BMC bioinformatics, vol. 17, no. 12, pp. 37--58, 2016.
[13]
L. Yang, X. Zhao, and X. Tang, "Predicting disease-related proteins based on clique backbone in protein-protein interaction network", International journal of biological sciences, vol. 10, no. 7, p. 677, 2014.
[14]
H. Yu, A. Paccanaro, V. Trifonov, and M. Gerstein, "Predicting interactions in protein networks by completing defective cliques", Bioinformatics, vol. 22, no. 7, pp. 823--829, 2006.
[15]
Z. Shi, C. K. Derow, and B. Zhang, "Co-expression module analysis reveals biological processes, genomic gain, and regulatory mechanisms associated with breast cancer progression", BMC systems biology, vol. 4, no. 1, pp. 1--14, 2010.
[16]
A. Emamjomeh, E. Saboori Robat, J. Zahiri, M. Solouki, and P. Khosravi, "Gene co-expression network reconstruction: a review on computational methods for inferring functional information from plant-based expression data", Plant biotechnology reports, vol. 11, no. 2, pp. 71--86, 2017.
[17]
V. Boginski, S. Butenko, and P. M. Pardalos, "Statistical analysis of financial networks", Computational statistics & data analysis, vol. 48, no. 2, pp. 431--443, 2005.
[18]
C. Bron and J. Kerbosch, "Algorithm 457: finding all cliques of an undirected graph", Communications of the ACM, vol. 16, no. 9, pp. 575--577, 1973.
[19]
T. Alusaifeer, S. Ramanna, C. J. Henry, and J. Peters, "Gpu implementation of mce approach to finding near neighbourhoods", in International Conference on Rough Sets and Knowledge Technology. Springer, 2013, pp. 251--262.
[20]
B. Lessley, T. Perciano, M. Mathai, H. Childs, and E. W. Bethel, "Maximal clique enumeration with data-parallel primitives", in 2017 IEEE 7th Symposium on Large Data Analysis and Visualization (LDAV). IEEE, 2017, pp. 16--25.
[21]
P. Jayaraj, K. Rahamathulla, and G. Gopakumar, "A gpu based maximum common subgraph algorithm for drug discovery applications", in 2016 IEEE international parallel and distributed processing symposium workshops (IPDPSW). IEEE, 2016, pp. 580--588.
[22]
Y.-W. Wei, W.-M. Chen, and H.-H. Tsai, "Accelerating the bron-kerbosch algorithm for maximal clique enumeration using gpus", IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 9, pp. 2352--2366, 2021.
[23]
M. Almasri, I. E. Hajj, R. Nagi, J. Xiong, and W.-m. Hwu, "Parallel k-clique counting on gpus", in Proceedings of the 36th ACM International Conference on Supercomputing, 2022, pp. 1--14.
[24]
W.-M. W. Hwu, D. B. Kirk, and I. El Hajj, Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, 2022.
[25]
J. Blanuša, R. Stoica, P. Ienne, and K. Atasu, "Manycore clique enumeration with fast set intersections", Proceedings of the VLDB Endowment, vol. 13, no. 12, pp. 2676--2690, 2020.
[26]
J. Gagneur, R. Krause, T. Bouwmeester, and G. Casari, "Modular decomposition of protein-protein interaction networks", Genome biology, vol. 5, pp. 1--12, 2004.
[27]
E. Tomita, A. Tanaka, and H. Takahashi, "The worst-case time complexity for generating all maximal cliques and computational experiments", Theoretical computer science, vol. 363, no. 1, pp. 28--42, 2006.
[28]
D. Eppstein, M. Löffler, and D. Strash, "Listing all maximal cliques in sparse graphs in near-optimal time", in International Symposium on Algorithms and Computation. Springer, 2010, pp. 403--414.
[29]
-----, "Listing all maximal cliques in large sparse real-world graphs", Journal of Experimental Algorithmics (JEA), vol. 18, pp. 3--1, 2013.
[30]
J. Jenkins, I. Arkatkar, J. D. Owens, A. Choudhary, and N. F. Samatova, "Lessons learned from exploring the backtracking paradigm on the gpu", in European Conference on Parallel Processing. Springer, 2011, pp. 425--437.
[31]
P. Yamout, K. Barada, A. Jaljuli, A. E. Mouawad, and I. El Hajj, "Parallel vertex cover algorithms on gpus", in 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2022, pp. 201--211.
[32]
B. Kerbl, M. Kenzel, J. H. Mueller, D. Schmalstieg, and M. Steinberger, "The broker queue: A fast, linearizable fifo queue for fine-granular work distribution on the gpu", in Proceedings of the 2018 International Conference on Supercomputing, 2018, pp. 76--85.
[33]
D. Merrill, "Cub", NVIDIA Research, 2015.
[34]
J. Leskovec and A. Krevl, "Snap datasets: Stanford large network dataset collection", 2014.
[35]
R. A. Rossi and N. K. Ahmed, "The network data repository with interactive graph analytics and visualization", in AAAI, 2015. [Online]. Available: http://networkrepository.com
[36]
T. Yu and M. Liu, "A linear time algorithm for maximal clique enumeration in large sparse graphs", Information Processing Letters, vol. 125, pp. 35--40, 2017.
[37]
J. Cheng, Y. Ke, A. W.-C. Fu, J. X. Yu, and L. Zhu, "Finding maximal cliques in massive networks", ACM Transactions on Database Systems (TODS), vol. 36, no. 4, pp. 1--34, 2011.
[38]
N. S. Dasari, R. Desh, and Z. M, "pbitmce: A bit-based approach for maximal clique enumeration on multicore processors", in 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2014, pp. 478--485.
[39]
T. Yu and M. Liu, "A memory efficient maximal clique enumeration method for sparse graphs with a parallel implementation", Parallel Computing, vol. 87, pp. 46--59, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167819118301297
[40]
M. C. Schmidt, N. F. Samatova, K. Thomas, and B.-H. Park, "A scalable, parallel algorithm for maximal clique enumeration", Journal of Parallel and Distributed Computing, vol. 69, no. 4, pp. 417--428, 2009.
[41]
A. Das, S.-V. Sanei-Mehri, and S. Tirthapura, "Shared-memory parallel maximal clique enumeration from static and dynamic graphs", ACM Transactions on Parallel Computing (TOPC), vol. 7, no. 1, pp. 1--28, 2020.
[42]
J. Cheng, L. Zhu, Y. Ke, and S. Chu, "Fast algorithms for maximal clique enumeration with limited memory", in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, 2012, pp. 1240--1248.
[43]
Y. Xu, J. Cheng, A. W.-C. Fu, and Y. Bu, "Distributed maximal clique computation", in 2014 IEEE International Congress on Big Data. IEEE, 2014, pp. 160--167.
[44]
Q. Chen, C. Fang, Z. Wang, B. Suo, Z. Li, and Z. G. Ives, "Parallelizing maximal clique enumeration over graph data", in International Conference on Database Systems for Advanced Applications. Springer, 2016, pp. 249--264.
[45]
B. Hou, Z. Wang, Q. Chen, B. Suo, C. Fang, Z. Li, and Z. G. Ives, "Efficient maximal clique enumeration over graph data", Data Science and Engineering, vol. 1, no. 4, pp. 219--230, 2016.
[46]
M. Svendsen, A. P. Mukherjee, and S. Tirthapura, "Mining maximal cliques from a large graph using mapreduce: Tackling highly uneven subproblem sizes", Journal of Parallel and distributed computing, vol. 79, pp. 104--114, 2015.
[47]
N. Chiba and T. Nishizeki, "Arboricity and subgraph listing algorithms", SIAM Journal on computing, vol. 14, no. 1, pp. 210--223, 1985.
[48]
I. Finocchi, M. Finocchi, and E. G. Fusco, "Clique counting in mapreduce: Algorithms and experiments", Journal of Experimental Algorithmics (JEA), vol. 20, pp. 1--20, 2015.
[49]
M. Danisch, O. Balalau, and M. Sozio, "Listing k-cliques in sparse real-world graphs", in Proceedings of the 2018 World Wide Web Conference, 2018, pp. 589--598.
[50]
J. Shi, L. Dhulipala, and J. Shun, "Parallel clique counting and peeling algorithms", arXiv preprint arXiv:2002.10047, 2020.
[51]
R.-H. Li, S. Gao, L. Qin, G. Wang, W. Yang, and J. X. Yu, "Ordering heuristics for k-clique listing", Proceedings of the VLDB Endowment, vol. 13, no. 12, pp. 2536--2548, 2020.
[52]
L. Gianinazzi, M. Besta, Y. Schaffner, and T. Hoefler, "Parallel algorithms for finding large cliques in sparse graphs", in Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures, 2021, pp. 243--253.
[53]
A. Lonkar and S. Beamer, "Accelerating clique counting in sparse real-world graphs via communication-reducing optimizations", arXiv preprint arXiv:2112.10913, 2021.
[54]
R. Pagh and C. E. Tsourakakis, "Colorful triangle counting and a mapreduce implementation", Information Processing Letters, vol. 112, no. 7, pp. 277--281, 2012.
[55]
M. Al Hasan and V. S. Dave, "Triangle counting in large networks: a review", Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 2, p. e1226, 2018.
[56]
M. Halappanavar and S. Ghosh, "Tric: Distributed-memory triangle counting by exploiting the graph structure", Pacific Northwest National Lab.(PNNL), Richland, WA (United States), Tech. Rep., 2020.
[57]
M. N. Kolountzakis, G. L. Miller, R. Peng, and C. E. Tsourakakis, "Efficient triangle counting in large graphs via degree-based vertex partitioning", Internet Mathematics, vol. 8, no. 1--2, pp. 161--185, 2012.
[58]
T. Steil, T. Reza, K. Iwabuchi, B. W. Priest, G. Sanders, and R. Pearce, "Tripoll: computing surveys of triangles in massive-scale temporal graphs with metadata", in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2021, pp. 1--12.
[59]
M. Almasri, N. Vasudeva, R. Nagi, J. Xiong, and W.-M. Hwu, "Hykernel: A hybrid selection of one/two-phase kernels for triangle counting on gpus", in 2021 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2021, pp. 1--7.
[60]
C. Pearson, M. Almasri, O. Anjum, V. S. Mailthody, Z. Qureshi, R. Nagi, J. Xiong, and W.-m. Hwu, "Update on triangle counting on gpu", in 2019 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2019, pp. 1--7.
[61]
L. Wang and J. D. Owens, "Fast bfs-based triangle counting on gpus", in 2019 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2019, pp. 1--6.
[62]
S. Pandey, X. S. Li, A. Buluc, J. Xu, and H. Liu, "H-index: Hash-indexing for parallel triangle counting on gpus", in 2019 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2019, pp. 1--7.
[63]
Y. Hu, H. Liu, and H. H. Huang, "High-performance triangle counting on gpus", in 2018 IEEE High Performance extreme Computing Conference (HPEC). IEEE, 2018, pp. 1--5.
[64]
V. S. Mailthody, K. Date, Z. Qureshi, C. Pearson, R. Nagi, J. Xiong, and W.-m. Hwu, "Collaborative (cpu+ gpu) algorithms for triangle counting and truss decomposition", in 2018 IEEE High Performance extreme Computing Conference (HPEC). IEEE, 2018, pp. 1--7.
[65]
M. Bisson and M. Fatica, "Update on static graph challenge on gpu", in 2018 IEEE High Performance extreme Computing Conference (HPEC). IEEE, 2018, pp. 1--8.
[66]
O. Green, P. Yalamanchili, and L.-M. Munguía, "Fast triangle counting on the gpu", in Proceedings of the 4th Workshop on Irregular Applications: Architectures and Algorithms, 2014, pp. 1--8.
[67]
O. Green, P. Yalamanchili, and L.-M. Munguía, "Fast triangle counting on the gpu", in Proceedings of the 4th Workshop on Irregular Applications: Architectures and Algorithms, ser. IA¡sup¿3¡/sup¿ '14. IEEE Press, 2014, p. 1--8.
[68]
L. Wang, Y. Wang, C. Yang, and J. D. Owens, "A comparative study on exact triangle counting algorithms on the gpu", in Proceedings of the ACM Workshop on High Performance Graph Processing, 2016, pp. 1--8.
[69]
M. G. Olabi, J. G. Luna, O. Mutlu, W.-m. Hwu, and I. El Hajj, "A compiler framework for optimizing dynamic parallelism on gpus", in 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). IEEE, 2022, pp. 1--13.
[70]
A. Pinar, C. Seshadhri, and V. Vishal, "Escape: Efficiently counting all 5-vertex subgraphs", in Proceedings of the 26th international conference on world wide web, 2017, pp. 1431--1440.
[71]
N. K. Ahmed, J. Neville, R. A. Rossi, N. G. Duffield, and T. L. Willke, "Graphlet decomposition: Framework, algorithms, and applications", Knowledge and Information Systems, vol. 50, no. 3, pp. 689--722, 2017.
[72]
E. R. Elenberg, K. Shanmugam, M. Borokhovich, and A. G. Dimakis, "Distributed estimation of graph 4-profiles", in Proceedings of the 25th International Conference on World Wide Web, 2016, pp. 483--493.
[73]
R. A. Rossi, R. Zhou, and N. K. Ahmed, "Estimation of graphlet counts in massive networks", IEEE transactions on neural networks and learning systems, vol. 30, no. 1, pp. 44--57, 2018.
[74]
P. Wang, J. Zhao, X. Zhang, Z. Li, J. Cheng, J. C. Lui, D. Towsley, J. Tao, and X. Guan, "Moss-5: A fast method of approximating counts of 5-node graphlets in large graphs", IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 1, pp. 73--86, 2017.
[75]
X. Chen, R. Dathathri, G. Gill, and K. Pingali, "Pangolin: an efficient and flexible graph mining system on cpu and gpu", Proceedings of the VLDB Endowment, vol. 13, no. 10, pp. 1190--1205, 2020.
[76]
W. Guo, Y. Li, and K.-L. Tan, "Exploiting reuse for gpu subgraph enumeration", IEEE Transactions on Knowledge and Data Engineering, 2020.
[77]
L. Wang, Y. Wang, and J. D. Owens, "Fast parallel subgraph matching on the gpu", in HPDC, 2016.
[78]
W. Lin, X. Xiao, X. Xie, and X.-L. Li, "Network motif discovery: A gpu approach", IEEE transactions on knowledge and data engineering, vol. 29, no. 3, pp. 513--528, 2016.
[79]
H.-N. Tran, J.-j. Kim, and B. He, "Fast subgraph matching on large graphs using graphics processors", in International Conference on Database Systems for Advanced Applications. Springer, 2015, pp. 299--315.
[80]
L. Zeng, L. Zou, M. T. Özsu, L. Hu, and F. Zhang, "Gsi: Gpu-friendly subgraph isomorphism", in 2020 IEEE 36th International Conference on Data Engineering (ICDE), 2020, pp. 1249--1260.
[81]
L. Wang and J. D. Owens, "Fast gunrock subgraph matching (gsm) on gpus", arXiv preprint arXiv:2003.01527, 2020.
[82]
X. Chen and A. Satyanarayan, "Efficient and scalable graph pattern mining on gpus", arXiv preprint arXiv:2112.09761, 2021.
[83]
W. Guo, Y. Li, M. Sha, B. He, X. Xiao, and K.-L. Tan, "GPU-accelerated subgraph enumeration on partitioned graphs", in Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020, pp. 1067--1082.
[84]
J. Wang and J. Cheng, "Truss decomposition in massive networks", arXiv preprint arXiv:1205.6693, 2012.
[85]
R. Pearce and G. Sanders, "K-truss decomposition for scale-free graphs at scale in distributed memory", in 2018 IEEE High Performance extreme Computing Conference (HPEC), 2018, pp. 1--6.
[86]
T. M. Low, D. G. Spampinato, A. Kutuluru, U. Sridhar, D. T. Popovici, F. Franchetti, and S. McMillan, "Linear algebraic formulation of edge-centric k-truss algorithms with adjacency matrices", in 2018 IEEE High Performance extreme Computing Conference (HPEC), 2018, pp. 1--7.
[87]
A. Conte, D. De Sensi, R. Grossi, A. Marino, and L. Versari, "Discovering k-trusses in large-scale networks", in 2018 IEEE High Performance extreme Computing Conference (HPEC), 2018, pp. 1--6.
[88]
M. Almasri, O. Anjum, C. Pearson, Z. Qureshi, V. S. Mailthody, R. Nagi, J. Xiong, and W.-m. Hwu, "Update on k-truss decomposition on gpu", in 2019 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2019, pp. 1--7.
[89]
S. Diab, M. G. Olabi, and I. El Hajj, "Ktrussexplorer: Exploring the design space of k-truss decomposition optimizations on gpus", in 2020 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2020, pp. 1--8.
[90]
Y. Che, Z. Lai, S. Sun, Y. Wang, and Q. Luo, "Accelerating truss decomposition on heterogeneous processors", Proceedings of the VLDB Endowment, vol. 13, no. 10, pp. 1751--1764, 2020.
[91]
O. Green, J. Fox, E. Kim, F. Busato, N. Bombieri, K. Lakhotia, S. Zhou, S. Singapura, H. Zeng, R. Kannan, V. Prasanna, and D. Bader, "Quickly finding a truss in a haystack", in 2017 IEEE High Performance Extreme Computing Conference (HPEC), 2017, pp. 1--7.
[92]
M. Bisson and M. Fatica, "Update on static graph challenge on gpu", in 2018 IEEE High Performance extreme Computing Conference (HPEC), 2018, pp. 1--8.
[93]
M. Blanco, T. M. Low, and K. Kim, "Exploration of fine-grained parallelism for load balancing eager k-truss on gpu and cpu", in 2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019, pp. 1--7.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PACT '23: Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques
October 2023
355 pages
ISBN:9798350342543

Sponsors

Publisher

IEEE Press

Publication History

Published: 26 November 2024

Check for updates

Badges

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

PACT '23
Sponsor:

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 5
    Total Downloads
  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)5
Reflects downloads up to 18 Dec 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media