More Web Proxy on the site http://driver.im/

research-article

Public Access

A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing

Authors:

Pranav Chaudhari,

Mario SzegedyAuthors Info & Claims

Proceedings of the ACM on Measurement and Analysis of Computing Systems, Volume 1, Issue 1

Article No.: 14, Pages 1 - 21

https://doi.org/10.1145/3084451

Published: 13 June 2017 Publication History

Abstract

Graph edge partition models have recently become an appealing alternative to graph vertex partition models for distributed computing due to their flexibility in balancing loads and their performance in reducing communication cost [6, 16]. In this paper, we propose a simple yet effective graph edge partitioning algorithm. In practice, our algorithm provides good partition quality (and better than similar state-of-the-art edge partition approaches, at least for power-law graphs) while maintaining low partition overhead. In theory, previous work [6] showed that an approximation guarantee of O(d_max√ log n log k) apply to the graphs with m=Ω(k²) edges (k is the number of partitions). We further rigorously proved that this approximation guarantee hold for all graphs.

We show how our edge partition model can be applied to parallel computing. We draw our example from GPU program locality enhancement and demonstrate that the graph edge partition model does not only apply to distributed computing with many computer nodes, but also to parallel computing in a single computer node with a many-core processor.

References

[1]

Fusion of parallel array operations. pages 71--85, 2016.

Digital Library

[2]

A.-L. Barabási and R. Albert. Emergence of scaling in random networks. science, 286(5439):509--512, 1999.

[3]

N. Bell and M. Garland. Efficient sparse matrix-vector multiplication on CUDA. NVIDIA Technical Report NVR-2008-004, NVIDIA Corporation, Dec. 2008.

[4]

R. F. Boisvert, R. Pozo, K. A. Remington, R. F. Barrett, and J. Dongarra. Matrix market: a web resource for test matrix collections. In Quality of Numerical Software, pages 125--137, 1996.

Digital Library

[5]

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan. A practical automatic polyhedral parallelizer and locality optimizer. In Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '08, pages 101--113, New York, NY, USA, 2008. ACM.

Digital Library

[6]

F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, pages 1456--1465, New York, NY, USA, 2014. ACM.

Digital Library

[7]

U. V. Catalyürek and C. Aykanat. PaToH: a multilevel hypergraph partitioning tool, version 3.0. Bilkent University, Department of Computer Engineering, Ankara, 6533, 1999.

[8]

S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S.-H. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC), IISWC '09, pages 44--54, Washington, DC, USA, 2009. IEEE Computer Society.

Digital Library

[9]

A. Corrigan, F. Camelli, R. Löhner, and J. Wallin. Running unstructured grid cfd solvers on modern graphics hardware. In 19th AIAA Computational Fluid Dynamics Conference, number AIAA 2009--4001, June 2009.

[10]

A. Corrigan, F. Camelli, R. Löhner, and J. Wallin. Running unstructured grid-based cfd solvers on modern graphics hardware. Int. J. Numer. Meth. Fluids, 66:221--229, 2011.

[11]

S. Dalton and N. Bell. CUSP: A C

[12]

templated sparse matrix library, 2014.

[13]

T. A. Davis and Y. Hu. The university of florida sparse matrix collection. ACM Trans. Math. Softw., 38(1):1:1--1:25, Dec. 2011.

Digital Library

[14]

C. Ding and K. Kennedy. Improving cache performance in dynamic applications through data and computation reorganization at run time. In Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation, PLDI '99, pages 229--241, New York, NY, USA, 1999. ACM.

Digital Library

[15]

C. Ding and K. Kennedy. Improving effective bandwidth through compiler enhancement of global cache reuse. Journal of Parallel and Distributed Computing, 64(1):108--134, 2004.

Digital Library

[16]

G. Even. Fast approximate graph partitioning algorithms. SIAM J. Comput., 28(6):2187--2214, Aug. 1999.

Digital Library

[17]

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, volume 12, page 2, 2012.

Digital Library

[18]

B. Hendrickson and T. G. Kolda. Graph partitioning models for parallel computing. Parallel Comput., 26(12):1519--1534, Nov. 2000.

Digital Library

[19]

B. Hendrickson and T. G. Kolda. Partitioning rectangular and structurally unsymmetric sparse matrices for parallel processing. SIAM Journal on Scientific Computing, 21(6):2048--2072, 2000.

Digital Library

[20]

M. R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. 1952.

[21]

G. Karypis and V. Kumar. Metis-unstructured graph partitioning and sparse matrix ordering system, version 2.0. 1995.

[22]

G. Karypis and V. Kumar. hMETIS 1.5: A hypergraph partitioning package. Technical report, Department of Computer Science, University of Minnesota, 1998.

[23]

R. Krauthgamer, J. S. Naor, and R. Schwartz. Partitioning graphs into balanced components. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '09, pages 942--949, Philadelphia, PA, USA, 2009. Society for Industrial and Applied Mathematics.

Digital Library

[24]

Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed graphlab: a framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5(8):716--727, 2012.

Digital Library

[25]

G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages 135--146. ACM, 2010.

Digital Library

[26]

NVIDIA. NVIDIA's Next Generation CUDA Compute Architecture: Kepler GK110. 2012.

[27]

NVIDIA. cuSPARSE library. 2014.

[28]

C. Tsourakakis, C. Gkantsidis, B. Radunovic, and M. Vojnovic. Fennel: Streaming graph partitioning for massive scale graphs. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM '14, pages 333--342, New York, NY, USA, 2014. ACM.

Digital Library

[29]

B. Wu, Z. Zhao, E. Z. Zhang, Y. Jiang, and X. Shen. Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on gpu. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, pages 57--68, New York, NY, USA, 2013. ACM.

Digital Library

[30]

E. Z. Zhang, Y. Jiang, Z. Guo, K. Tian, and X. Shen. On-the-fly elimination of dynamic irregularities for gpu computing. In Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVI, pages 369--380, New York, NY, USA, 2011. ACM.

Digital Library

Cited By

Du XWang MLu ZDuan QLiu YFeng JWang H(2024)HRCM: A Hierarchical Regularizing Mechanism for Sparse and Imbalanced Communication in Whole Human Brain SimulationsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.338772035:6(1056-1073)Online publication date: Jun-2024
https://doi.org/10.1109/TPDS.2024.3387720
Gan XGuo JGuo PWu GSi JMei SLiu CLi TEl Saddik AMei TCucchiara RBertini MTobon Vallejo DAtrey PHossain M(2023)GraphMedia: Communication-balanced Graph Searching for Billion-scale Social Media AccessProceedings of the 31st ACM International Conference on Multimedia10.1145/3581783.3613828(8984-8993)Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3613828
Gan XWu GZeng RSi JLiu JDong DGong CLiu CLi TGallivan KNikolopoulos DBeivide RGallopoulos E(2023)FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph ProcessingProceedings of the 37th International Conference on Supercomputing10.1145/3577193.3593729(240-250)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3577193.3593729
Show More Cited By

Index Terms

A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing

Recommendations

Balanced graph edge partition
KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining

Balanced edge partition has emerged as a new approach to partition an input graph data for the purpose of scaling out parallel computations, which is of interest for several modern data analytics computation platforms, including platforms for iterative ...
A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
SIGMETRICS '17 Abstracts: Proceedings of the 2017 ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems

Graph edge partition models have recently become an appealing alternative to graph vertex partition models for distributed computing due to both their flexibility in balancing loads and their performance in reducing communication cost.

In this paper, we ...
A Simple Yet Effective Balanced Edge Partition Model for Parallel Computing
Performance evaluation review

Graph edge partition models have recently become an appealing alternative to graph vertex partition models for distributed computing due to both their flexibility in balancing loads and their performance in reducing communication cost.

In this paper, we ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Measurement and Analysis of Computing Systems

Proceedings of the ACM on Measurement and Analysis of Computing Systems Volume 1, Issue 1

June 2017

712 pages

EISSN:2476-1249

DOI:10.1145/3107080

Issue’s Table of Contents

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 2017

Published in POMACS Volume 1, Issue 1

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

NSF

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

15
Total Citations
View Citations
490
Total Downloads

Downloads (Last 12 months)99
Downloads (Last 6 weeks)10

Reflects downloads up to 10 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Du XWang MLu ZDuan QLiu YFeng JWang H(2024)HRCM: A Hierarchical Regularizing Mechanism for Sparse and Imbalanced Communication in Whole Human Brain SimulationsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.338772035:6(1056-1073)Online publication date: Jun-2024
https://doi.org/10.1109/TPDS.2024.3387720
Gan XGuo JGuo PWu GSi JMei SLiu CLi TEl Saddik AMei TCucchiara RBertini MTobon Vallejo DAtrey PHossain M(2023)GraphMedia: Communication-balanced Graph Searching for Billion-scale Social Media AccessProceedings of the 31st ACM International Conference on Multimedia10.1145/3581783.3613828(8984-8993)Online publication date: 26-Oct-2023
https://dl.acm.org/doi/10.1145/3581783.3613828
Gan XWu GZeng RSi JLiu JDong DGong CLiu CLi TGallivan KNikolopoulos DBeivide RGallopoulos E(2023)FT-topo: Architecture-Driven Folded-Triangle Partitioning for Communication-efficient Graph ProcessingProceedings of the 37th International Conference on Supercomputing10.1145/3577193.3593729(240-250)Online publication date: 21-Jun-2023
https://dl.acm.org/doi/10.1145/3577193.3593729
Schlag SHeuer TGottesbüren LAkhremtsev YSchulz CSanders P(2023)High-Quality Hypergraph PartitioningACM Journal of Experimental Algorithmics10.1145/352909027(1-39)Online publication date: 10-Feb-2023
https://dl.acm.org/doi/10.1145/3529090
Liu YDu XLu ZDuan QFeng JWang MWu J(2022)Regularizing Sparse and Imbalanced Communications for Voxel-based Brain Simulations on SupercomputersProceedings of the 51st International Conference on Parallel Processing10.1145/3545008.3545019(1-11)Online publication date: 29-Aug-2022
https://dl.acm.org/doi/10.1145/3545008.3545019
Tiernan EHodges B(2022)A Topological Approach to Partitioning Flow Networks for Parallel SimulationJournal of Computing in Civil Engineering10.1061/(ASCE)CP.1943-5487.000102036:4Online publication date: Jul-2022
https://doi.org/10.1061/(ASCE)CP.1943-5487.0001020
Lakhotia KKannan RPati SPrasanna V(2020)GPOPACM Transactions on Parallel Computing10.1145/33809427:1(1-24)Online publication date: 9-Mar-2020
https://dl.acm.org/doi/10.1145/3380942
Zhang SLiang YGe JXiao MWu J(2020)Provably Efficient Resource Allocation for Edge Service Entities Using HermesIEEE/ACM Transactions on Networking10.1109/TNET.2020.2989307(1-14)Online publication date: 2020
https://doi.org/10.1109/TNET.2020.2989307
Blum JLi RStorandt S(2020)FISSION: A Practical Algorithm for Computing Minimum Balanced Node SeparatorsCombinatorial Optimization and Applications10.1007/978-3-030-64843-5_55(817-832)Online publication date: 11-Dec-2020
https://dl.acm.org/doi/10.1007/978-3-030-64843-5_55
Avdiukhin DPupyrev SYaroslavtsev G(2019)Multi-dimensional balanced graph partitioning via projected gradient descentProceedings of the VLDB Endowment10.14778/3324301.332430712:8(906-919)Online publication date: 1-Apr-2019
https://dl.acm.org/doi/10.14778/3324301.3324307
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents