More Web Proxy on the site http://driver.im/

research-article

GraphCube: Interconnection Hierarchy-aware Graph Processing

Authors:

Zheng WangAuthors Info & Claims

PPoPP '24: Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming

Pages 160 - 174

https://doi.org/10.1145/3627535.3638498

Published: 20 February 2024 Publication History

Abstract

Processing large-scale graphs with billions to trillions of edges requires efficiently utilizing parallel systems. However, current graph processing engines do not scale well beyond a few tens of computing nodes because they are oblivious to the communication cost variations across the interconnection hierarchy. We introduce GraphCube, a better approach to optimizing graph processing on large-scale parallel systems with complex interconnections. GraphCube features a new graph partitioning approach to achieve better load balancing and minimize communication overhead across multiple levels of the interconnection hierarchy. We evaluate GraphCube by applying it to fundamental graph operations performed on synthetic and real-world graph datasets. Our evaluation used up to 79,024 computing nodes and 1.2+ million processor cores. Our large-scale experiments show that GraphCube outperforms state-of-the-art parallel graph processing methods in throughput and scalability. Furthermore, GraphCube outperformed the top-ranked systems on the Graph 500 list.

Supplementary Material

PDF File (p160-gan-supp.pdf)

Supplemental material.

Download
1.50 MB

References

[1]

Zainab Abbas, Vasiliki Kalavri, Paris Carbone, and Vladimir Vlassov. 2018. Streaming graph partitioning: an experimental study. Proceedings of the VLDB Endowment 11, 11 (2018), 1590--1603.

Digital Library

[2]

Soramichi Akiyama. 2020. Assessing Impact of Data Partitioning for Approximate Memory in C/C++ Code. arXiv preprint arXiv:2004.01637 (2020).

[3]

Scott Beamer, Krste Asanovic, and David Patterson. 2012. Direction-optimizing breadth-first search. In SC'12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE, 1--10.

Digital Library

[4]

Huanqi Cao, Yuanwei Wang, Haojie Wang, Heng Lin, Zixuan Ma, Wanwang Yin, and Wenguang Chen. 2022. Scaling graph traversal to 281 trillion edges with 40 million cores. In Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 234--245.

Digital Library

[5]

Donglin Chen, Jianbin Fang, Shizhao Chen, Chuanfu Xu, and Zheng Wang. 2019. Optimizing sparse matrix-vector multiplications on an armv8-based many-core architecture. International Journal of Parallel Programming 47, 3 (2019), 418--432.

Digital Library

[6]

Donglin Chen, Jianbin Fang, Chuanfu Xu, Shizhao Chen, and Zheng Wang. 2020. Characterizing scalability of sparse matrix-vector multiplications on phytium ft-2000+. International Journal of Parallel Programming 48, 1 (2020), 80--97.

Digital Library

[7]

R. Chen, J. Shi, Y. Chen, and H. Chen. 2015. Powerlyra: Differentiated graph computation and partitioning on skewed graphs. European Conference on Computer Systems (2015), 1--15.

[8]

Rong Chen, Jiaxin Shi, Yanzhe Chen, Binyu Zang, Haibing Guan, and Haibo Chen. 2018. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs. ACM Trans. Parallel Comput. 5, 3 (2018), 13:1--13:39.

Digital Library

[9]

Rong Chen, Jiaxin Shi, Yanzhe Chen, Binyu Zang, Haibing Guan, and Haibo Chen. 2019. Powerlyra: Differentiated graph computation and partitioning on skewed graphs. ACM Transactions on Parallel Computing (TOPC) 5, 3 (2019), 1--39.

Digital Library

[10]

Yongzhi Chen and Yuefan Deng. 2009. A detailed analysis of communication load balance on BlueGene supercomputer. Computer physics communications 180, 8 (2009), 1251--1258.

[11]

William James Dally and Brian Patrick Towles. 2004. Principles and practices of interconnection networks. Elsevier.

Digital Library

[12]

Roshan Dathathri, Gurbinder Gill, Loc Hoang, Hoang-Vu Dang, Alex Brooks, Nikoli Dryden, Marc Snir, and Keshav Pingali. 2018. Gluon: A communication-optimizing substrate for distributed heterogeneous graph analytics. In Proceedings of the 39th ACM SIGPLAN conference on programming language design and implementation. 752--768.

Digital Library

[13]

Timothy A Davis. 2006. Direct methods for sparse linear systems. SIAM.

[14]

Jack Dongarra. 2020. Report on the Fujitsu Fugaku system. University of Tennessee-Knoxville Innovative Computing Laboratory, Tech. Rep. ICLUT-20-06 (2020).

[15]

Wenfei Fan, Tao He, Longbin Lai, Xue Li, Yong Li, Zhao Li, Zhengping Qian, Chao Tian, Lei Wang, Jingbo Xu, et al. 2021. GraphScope: a unified engine for big graph processing. Proceedings of the VLDB Endowment 14, 12 (2021), 2879--2892.

Digital Library

[16]

Wenfei Fan, Muyang Liu, Chao Tian, Ruiqi Xu, and Jingren Zhou. 2020. Incrementalization of graph partitioning algorithms. Proceedings of the VLDB Endowment 13, 8 (2020), 1261--1274.

Digital Library

[17]

Wenfei Fan, Ruiqi Xu, Qiang Yin, Wenyuan Yu, and Jingren Zhou. 2022. Application-driven graph partitioning. The VLDB Journal (2022), 1--24.

[18]

Jianbin Fang, Peng Zhang, Chun Huang, Tao Tang, Kai Lu, Ruibo Wang, and Zheng Wang. 2022. Programming Bare-Metal Accelerators with Heterogeneous Threading Models: A Case Study of Matrix-3000. arXiv preprint arXiv:2210.12230 (2022).

[19]

Xing Feng, Lijun Chang, Xuemin Lin, Lu Qin, and Wenjie Zhang. 2016. Computing connected components with linear communication cost in pregel-like systems. In 2016 IEEE 32nd International Conference on Data Engineering (ICDE). IEEE, 85--96.

[20]

Zhisong Fu, Michael Personick, and Bryan Thompson. 2014. Map-Graph: A high level API for fast development of high performance graph analytics on GPUs. In Proceedings of workshop on GRAph data management experiences and systems. 1--6.

Digital Library

[21]

Pablo Fuentes, Mariano Benito, Enrique Vallejo, José Luis Bosque, Ramön Beivide, Andreea Anghel, Germán Rodríguez, Mitch Gusat, Cyriel Minkenberg, and Mateo Valero. 2017. A scalable synthetic traffic model of Graph500 for computer networks analysis. Concurrency and Computation: Practice and Experience 29, 24 (2017), e4231.

[22]

Xinbiao Gan, Yiming Zhang, Ruibo Wang, Tiejun Li, Tiaojie Xiao, Ruigeng Zeng, Jie Liu, and Kai Lu. 2021. TianheGraph: Customizing Graph Search for Graph500 on Tianhe Supercomputer. IEEE Transactions on Parallel and Distributed Systems (2021).

[23]

Xinbiao Gan, Yiming Zhang, Ruigeng Zeng, Jie Liu, Ruibo Wang, Tiejun Li, Li Chen, and Kai Lu. 2022. XTree: Traversal-Based Partitioning for Extreme-Scale Graph Processing on Supercomputers. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 2046--2059.

[24]

Tao Gao, Yutong Lu, Baida Zhang, and Guang Suo. 2014. Using the intel many integrated core to accelerate graph traversal. The International journal of high performance computing applications 28, 3 (2014), 255--266.

[25]

Sayan Ghosh, Nathan R Tallent, and Mahantesh Halappanavar. 2021. Characterizing Performance of Graph Neighborhood Communication Patterns. IEEE Transactions on Parallel and Distributed Systems 33, 4 (2021), 915--928.

[26]

Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation. 17--30.

Digital Library

[27]

Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. 2014. GraphX: graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX conference on Operating Systems Design and Implementation. 599--613.

[28]

Samuel Grossman, Heiner Litz, and Christos Kozyrakis. 2018. Making pull-based graph processing performant. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 246--260.

Digital Library

[29]

http://graph500.org/. 2021. The Graph 500 List. https://graph500.org/ Last accessed 03 March 2022.

[30]

Eu Inc. 2022. url.eu-2015. https://law.di.unimi.it/webdata/eu-2015/ Last accessed 03 December 2022.

[31]

Twitter Inc. 2021. twitter-2010. https://law.di.unimi.it/webdata/twitter-2010/ Last accessed 03 December 2021.

[32]

Keita Iwabuchi, Hitoshi Sato, Yuichiro Yasui, Katsuki Fujisawa, and Satoshi Matsuoka. 2014. NVM-based hybrid BFS with memory efficient data structure. In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 529--538.

[33]

George Karypis and Vipin Kumar. 1995. METIS-unstructured graph partitioning and sparse matrix ordering system, version 2.0. (1995).

[34]

George Karypis and Vipin Kumar. 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on scientific Computing 20, 1 (1998), 359--392.

Digital Library

[35]

George Karypis, Kirk Schloegel, and Vipin Kumar. 1997. Parmetis: Parallel graph partitioning and sparse matrix ordering library. (1997).

[36]

John Kim, Wiliam J Dally, Steve Scott, and Dennis Abts. 2008. Technology-driven, highly-scalable dragonfly topology. In 2008 International Symposium on Computer Architecture. IEEE, 77--88.

Digital Library

[37]

Deyu Kong, Xike Xie, and Zhuoxu Zhang. 2022. Clustering-based Partitioning for Large Web Graphs. arXiv preprint arXiv:2201.00472 (2022).

[38]

Dongsheng Li, Yiming Zhang, Jinyan Wang, and KianLee Tan. 2019. TopoX: Topology Refactorization for Efficient Graph Partitioning and Processing. PVLDB 12, 8 (2019), 891--905.

Digital Library

[39]

Dongsheng Li, Yiming Zhang, Jinyan Wang, and Kian-Lee Tan. 2019. TopoX: Topology refactorization for efficient graph partitioning and processing. Proceedings of the VLDB Endowment 12, 8 (2019), 891--905.

Digital Library

[40]

Z Li, C Wu, and Y Li. 2021. FEP-based large-scale virtual screening for effective drug discovery against COVID-19. In Int. Conf. High Performance Computing, Networking, Storage, and Analysis.

[41]

Xiangke LIAO, Liquan XIAO, Canqun YANG, and Yutong LU. [n.d.]. MilkyWay-2 supercomputer: system and application. ([n. d.]).

[42]

Xiang-Ke Liao, Zheng-Bin Pang, Ke-Fei Wang, Yu-Tong Lu, Min Xie, Jun Xia, De-Zun Dong, and Guang Suo. 2015. High performance interconnect network for Tianhe system. Journal of Computer Science and Technology 30, 2 (2015), 259--272.

[43]

Heng Lin, Xiongchao Tang, Bowen Yu, Youwei Zhuo, Wenguang Chen, Jidong Zhai, Wanwang Yin, and Weimin Zheng. 2017. Scalable graph traversal on sunway taihulight with ten million cores. In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 635--645.

[44]

Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, et al. 2018. Shentu: processing multi-trillion edge graphs on millions of cores in seconds. In SC18: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 706--716.

Digital Library

[45]

Yucheng Low. 2013. Graphlab: A distributed abstraction for large scale machine learning. University of California (2013).

[46]

Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed GraphLab: A Framework for Machine Learning in the Cloud. PVLDB 5, 8 (2012), 716--727.

Digital Library

[47]

Kai Lu, Yaohua Wang, Yang Guo, Chun Huang, Sheng Liu, Ruibo Wang, Jianbin Fang, Tao Tang, Zhaoyun Chen, Biwei Liu, et al. 2022. MT-3000: a heterogeneous multi-zone processor for HPC. CCF Transactions on High Performance Computing (2022), 1--15.

[48]

Meilian Lu, Zhenglin Zhang, Zhihe Qu, and Yu Kang. 2018. LPANNI: Overlapping community detection using label propagation in large-scale complex networks. IEEE Transactions on Knowledge and Data Engineering 31, 9 (2018), 1736--1749.

Digital Library

[49]

Yutong Lu. 2019. Paving the way for China exascale computing. CCF Transactions on High Performance Computing 1, 2 (2019), 63--72.

[50]

Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2009. Pregel: a system for large-scale graph processing. Sigmod (2009), 135--146.

[51]

Ruben Mayer and Hans-Arno Jacobsen. 2021. Hybrid edge partitioner: partitioning large power-law graphs under memory constraints. In Proceedings of the 2021 International Conference on Management of Data. 1289--1302.

Digital Library

[52]

Andrey Molyakov. 2019. Age of Great Chinese Dragon: Supercomputer Centers and High Performance Computing. Journal of Electrical and Electronic Engineering 7, 4 (2019), 87--94.

[53]

Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama, and Mitsuhisa Sato. 2020. Performance Evaluation of Supercomputer Fugaku using Breadth-First Search Benchmark in Graph500. In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 408--409.

[54]

Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama, and Mitsuhisa Sato. 2021. Performance of the supercomputer fugaku for breadth-first search in graph500 benchmark. In High Performance Computing: 36th International Conference, ISC High Performance 2021, Virtual Event, June 24--July 2, 2021, Proceedings 36. Springer, 372--390.

Digital Library

[55]

Mahdi Nikdast, Jiang Xu, Luan HK Duong, Xiaowen Wu, Zhehui Wang, Xuan Wang, and Zhe Wang. 2014. Fat-tree-based optical interconnection networks under crosstalk noise constraint. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 23, 1 (2014), 156--169.

Digital Library

[56]

Joel Nishimura and Johan Ugander. 2013. Restreaming graph partitioning: simple versatile algorithms for advanced balancing. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. 1106--1114.

Digital Library

[57]

Anil Pacaci and M Tamer Özsu. 2019. Experimental analysis of streaming algorithms for graph partitioning. In Proceedings of the 2019 International Conference on Management of Data. 1375--1392.

Digital Library

[58]

Julian Shun and Guy E Blelloch. 2013. Ligra: a lightweight graph processing framework for shared memory. In ACM Sigplan Notices, Vol. 48. ACM, 135--146.

Digital Library

[59]

George M Slota, Cameron Root, Karen Devine, Kamesh Madduri, and Sivasankaran Rajamanickam. 2020. Scalable, multi-constraint, complex-objective graph partitioning. IEEE Transactions on Parallel and Distributed Systems 31, 12 (2020), 2789--2801.

Digital Library

[60]

Hari Subramoni, Albert Mathews Augustine, Mark Arnold, Jonathan Perkins, Xiaoyi Lu, Khaled Hamidouche, and Dhabaleswar K Panda. 2016. INAM 2: InfiniBand Network Analysis and Monitoring with MPI. In International Conference on High Performance Computing. Springer, 300--320.

[61]

Toyotaro Suzumura, Koji Ueno, Hitoshi Sato, Katsuki Fujisawa, and Satoshi Matsuoka. 2011. Performance characteristics of Graph500 on large-scale distributed environment. In 2011 IEEE International Symposium on Workload Characterization (IISWC). IEEE, 149--158.

Digital Library

[62]

Koji Ueno and Toyotaro Suzumura. 2012. 2d partitioning based graph search for the graph500 benchmark. In 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. IEEE, 1925--1931.

Digital Library

[63]

Koji Ueno and Toyotaro Suzumura. 2012. Highly scalable graph search for the graph500 benchmark. In Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing. 149--160.

Digital Library

[64]

Koji Ueno, Toyotaro Suzumura, Naoya Maruyama, Katsuki Fujisawa, and Satoshi Matsuoka. 2016. Extreme scale breadth-first search on supercomputers. In 2016 IEEE International Conference on Big Data (Big Data). IEEE, 1040--1047.

[65]

Carnegie Mellon University. 2021. ClueWeb12 Dataset. https://lemurproject.org/clueweb12/ Last accessed 03 December 2021.

[66]

Erik Vermij, Leandro Fiorin, Christoph Hagleitner, and Koen Bertels. 2017. Boosting the efficiency of HPCG and Graph500 with near-data processing. In 2017 46th International Conference on Parallel Processing (ICPP). IEEE, 31--40.

[67]

Ruibo Wang, Kai Lu, Juan Chen, Wenzhe Zhang, Jinwen Li, Yuan Yuan, Pingjing Lu, Libo Huang, Shengguo Li, and Xiaokang Fan. 2020. Brief introduction of TianHe exascale prototype system. Tsinghua Science and Technology 26, 3 (2020), 361--369.

[68]

Yuanwei Wang, Huanqi Cao, Zixuan Ma, Wanwang Yin, and Wenguang Chen. 2022. Scaling graph 500 SSSP to 140 trillion edges with over 40 million cores. In 2022 SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC). IEEE Computer Society, 248--262.

[69]

Min Xie, Yutong Lu, Kefei Wang, Lu Liu, Hongjia Cao, et al. 2011. Tianhe-1a interconnect and message-passing services. IEEE Micro 32, 1 (2011), 8--20.

[70]

Xue-Jun Yang, Xiang-Ke Liao, Kai Lu, Qing-Feng Hu, Jun-Qiang Song, and Jin-Shu Su. 2011. The TianHe-1A supercomputer: its hardware and software. Journal of computer science and technology 26, 3 (2011), 344--351.

[71]

Yuichiro Yasui and Katsuki Fujisawa. 2015. Fast and scalable NUMA-based thread parallel breadth-first search. In 2015 International Conference on High Performance Computing & Simulation (HPCS). IEEE, 377--385.

[72]

Yuichiro Yasui, Katsuki Fujisawa, and Kazushige Goto. 2013. NUMA-optimized parallel breadth-first search on multicore single-node system. In 2013 IEEE International Conference on Big Data. IEEE, 394--402.

[73]

Andy Yoo, Edmond Chow, Keith Henderson, William McLendon, Bruce Hendrickson, and Umit Catalyurek. 2005. A scalable distributed parallel breadth-first search algorithm on BlueGene/L. In SC'05: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing. IEEE, 25--25.

Digital Library

[74]

Jason Jongjin Yoo and Alan E Willner. 2001. A Performance and Implementation Comparison of Bidirectional and Dual Bus 2-D WDM Multiple-Plane Optical Interconnections with Row-ColumnMultihop Network Structures. Journal of lightwave technology 19, 6 (2001), 801.

[75]

Xin You, Hailong Yang, Zhongzhi Luan, Yi Liu, and Depei Qian. 2019. Performance evaluation and analysis of linear algebra kernels in the prototype tianhe-3 cluster. In Asian Conference on Supercomputing Frontiers. Springer, 86--105.

[76]

Jeffrey Young, Julian Romera, Matthias Hauck, and Holger Fröning. 2016. Optimizing communication for a 2D-partitioned scalable BFS. In 2016 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 1--7.

[77]

Chenglong Zhang. 2020. A New Perspective of Graph Data and A Generic and Efficient Method for Large Scale Graph Data Traversal. arXiv preprint arXiv:2009.07463 (2020).

[78]

Hongyang Zhang, Peter Lofgren, and Ashish Goel. 2016. Approximate personalized pagerank on dynamic graphs. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 1315--1324.

Digital Library

[79]

Yiming Zhang, Haonan Wang, Menghan Jia, Jinyan Wang, Dong sheng Li, Guangtao Xue, and K. Tan. 2020. TopoX: Topology Refactorization for Minimizing Network Communication in Graph Computations. IEEE/ACM Transactions on Networking 28 (2020), 2768--2782.

Digital Library

[80]

Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A Computation-Centric Distributed Graph Processing System. In 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2--4, 2016, Kimberly Keeton and Timothy Roscoe (Eds.). USENIX Association, 301--316. https://www.usenix.org/conference/osdi16/technical-sessions/presentation/zhu

Cited By

Gan X(2024)GraphService: Topology-aware Constructor for Large-scale Graph ApplicationsACM Transactions on Architecture and Code Optimization10.1145/3689341Online publication date: 17-Aug-2024
https://dl.acm.org/doi/10.1145/3689341
Arai JNakao MInoue YTeranishi KUeno KYamamura KSato MFujisawa K(2024)Doubling Graph Traversal Efficiency to 198 TeraTEPS on the Supercomputer FugakuProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00107(1-14)Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SC41406.2024.00107

Index Terms

GraphCube: Interconnection Hierarchy-aware Graph Processing
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed algorithms
  2. Parallel computing methodologies
    1. Parallel algorithms
      1. Massively parallel algorithms

Recommendations

FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph Processing
ICS '23: Proceedings of the 37th ACM International Conference on Supercomputing

As graph size (numbers of vertices and edges) is increasing from billions to trillions, efficient graph processing requires exascale computing clusters, which consist of hundreds of thousands of nodes connected via hierarchical networks with multiple ...
Partitioning of a graph into induced subgraphs not containing prescribed cliques
Abstract
Let K p be a complete graph of order p ≥ 2. A K p-free k-coloring of a graph H is a partition of V ( H ) into V 1 , V 2 … , V k such that H [ V i ] does not contain K p for each i ≤ k. In 1977 Borodin and Kostochka conjectured that any graph H ...
GraphService: Topology-aware Constructor for Large-scale Graph Applications
Graph-based services are becoming integrated into everyday life through graph applications and graph learning systems. While traditional graph processing approaches boast excellent throughput with millisecond-level processing time, the construction phase ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

PPoPP '24: Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming

March 2024

498 pages

ISBN:9798400704352

DOI:10.1145/3627535

Chair:
Michel Steuwer,
Program Chairs:
I-Ting Angelina Lee,
Milind Chabbi
Uber Technologies Inc.

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 February 2024

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Natural Science Foundation of China

Conference

PPoPP '24

Sponsor:

PPoPP '24: 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming

March 2 - 6, 2024

Edinburgh, United Kingdom

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
394
Total Downloads

Downloads (Last 12 months)394
Downloads (Last 6 weeks)49

Reflects downloads up to 11 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Gan X(2024)GraphService: Topology-aware Constructor for Large-scale Graph ApplicationsACM Transactions on Architecture and Code Optimization10.1145/3689341Online publication date: 17-Aug-2024
https://dl.acm.org/doi/10.1145/3689341
Arai JNakao MInoue YTeranishi KUeno KYamamura KSato MFujisawa K(2024)Doubling Graph Traversal Efficiency to 198 TeraTEPS on the Supercomputer FugakuProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00107(1-14)Online publication date: 17-Nov-2024
https://dl.acm.org/doi/10.1109/SC41406.2024.00107

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents