[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1109/CCGrid.2013.85acmotherconferencesArticle/Chapter ViewAbstractPublication PagesccgridConference Proceedingsconference-collections
research-article

A scalable implementation of a mapreduce-based graph processing algorithm for large-scale heterogeneous supercomputers

Published: 13 May 2013 Publication History

Abstract

Fast processing for extremely large-scale graph is becoming increasingly important in various domains such as health care, social networks, intelligence, system biology, and electric power grids. The GIM-V algorithm based on MapReduce programing model is designed as a general graph processing method for supporting petabyte-scale graph data. On the other hand, recent large-scale data-intensive computing systems tend to employ GPU accelerators to gain good peak performance and high memory bandwidth; however, the validity of acceleration, including optimization techniques, of the GIM-V algorithm using GPUs is an open problem. To address the problem, we implemented a multi-GPU-based GIM-V application with load balance optimization between GPU devices. Our implementation extends the existing MapReduce library for supporting multi-GPU-environments using the MPI library and optimizes load balance between GPU devices by employing task scheduling-based graph partitioning. We conducted our implementation on the TSUBAME2.0 supercomputer using 256 nodes (6144 hyper-threaded CPU cores, 768 GPUs). The results exhibit that our GPU-based implementation performed 87.04 ME/s on 230 (1.07 billion) vertices and 234 (17.2 billion) edges, and 1.52 times faster than the CPU-based naive implementation with 229 vertices and 233 edges. We also studied the performance characteristics of our implementation and load balance optimization technique.

References

[1]
"Facebook." {Online}. Available: http://www.facebook.com
[2]
J. A. Ang, B. W. Barrett, K. B. Wheeler, and R. C. Murphy, "Introducing the graph 500," May 2010.
[3]
J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," OSDI '04, Sixth Symposium on Operating System Design and Implementation, pp. 137--150, 2004.
[4]
U. Kang, C. E. Tsourakakis, and C. Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System- Implementation and Observations," 2009.
[5]
A. Bialecki, M. Cordova, D. Cutting, and O. O'Malley, "Hadoop: a framework for running applications on large clusters built of commodity hardware," 2005. {Online}. Available: http://lucene.apache.org/hadoop
[6]
"Amazon elastic compute cloud (amazon ec2)." {Online}. Available: http://aws.amazon.com/ec2/
[7]
K. Shirahata, H. Sato, T. Suzumura, and S. Matsuoka, "A GPU Implementation of Generalized Graph Processing Algorithm GIM-V," in 2012 International Workshop on Parallel Algorithm and Parallel Software (IWPAPS12), September 2012.
[8]
J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani, "Kronecker graphs: An approach to modeling networks," J. Mach. Learn. Res., vol. 11, pp. 985--1042, Mar. 2010. {Online}. Available: http://dl.acm.org/citation.cfm?id=1756006.1756039
[9]
S. Brin and L. Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine," in Seventh International World-Wide Web Conference (WWW 1998), 1998. {Online}. Available: http://ilpubs.stanford.edu:8090/361/
[10]
B. He, W. Fang, Q. Luo, N. K.Govindaraju, and T. Wang, "Mars: A mapreduce framework on graphics processors," Parallel Architectures and Compilation Techniques, pp. 260--269, 2008.
[11]
W. Fang, B. He, Q. Luo, and N. K. Govindaraju, "Mars: Accelerating MapReduce with Graphics Processors," IEEE Transactions on Parallel and Distributed Systems, vol. 22, pp. 608--620, 2011.
[12]
N. Govindaraju, J. Gray, R. Kumar, and D. Manocha, "GPUTeraSort: high performance graphics coprocessor sorting for large database management," SIGMOD, 2006.
[13]
R. L. Graham, "Bounds on multiprocessing anomalies and related packing algorithms," in Proceedings of the May 16--18, 1972, spring joint computer conference, ser. AFIPS '72 (Spring). New York, NY, USA: ACM, 1972, pp. 205--217. {Online}. Available
[14]
J. Bruno, E. G. Coffman, Jr., and R. Sethi, "Scheduling independent tasks to reduce mean finishing time," Commun. ACM, vol. 17, no. 7, pp. 382--387, Jul. 1974. {Online}. Available
[15]
C. Chekuri, S. Khanna, and A. Zhu, "Algorithms for minimizing weighted flow time," in Proceedings of the thirty-third annual ACM symposium on Theory of computing, ser. STOC '01. New York, NY, USA: ACM, 2001, pp. 84--93. {Online}. Available
[16]
D. A. Bader, J. Berry, S. Kahan, R. Murphy, and E. J. Riedy. The graph 500 list. Graph500.org. {Online}. Available: http://www.graph500.org/
[17]
J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani, "Kronecker graphs: An approach to modeling networks," J. Mach. Learn. Res., vol. 11, pp. 985--1042, March 2010. {Online}. Available: http://portal.acm.org/citation.cfm?id=1756006.1756039
[18]
S. Matsuoka, T. Endo, N. Maruyama, H. Sato, and S. Takizawa, "The total picture of tsubame2.0," Tsubame e-Science Journal, vol. 1, pp. 2 -- 4, 2010.
[19]
P. Harish and P. J. Narayanan, "Accelerating large graph algorithms on the GPU using CUDA," in Proceedings of the 14th international conference on High performance computing, ser. HiPC'07. Berlin, Heidelberg: Springer-Verlag, 2007, pp. 197--208. {Online}. Available: http://portal.acm.org/citation.cfm?id=1782174.1782200
[20]
J. A. Stuart and J. D. Owens, "Multi-GPU MapReduce on GPU Clusters," in Proceedings of the 25th IEEE International Parallel and Distributed Processing Symposium. IEEE, May 2011.
[21]
S. J. Plimpton and K. D. Devine, "MapReduce in MPI for Large-scale Graph Algorithms," Parallel Computing, vol. In Press, no. February, pp. 1--39, 2011. {Online}. Available: http://linkinghub.elsevier.com/retrieve/pii/S0167819111000172
[22]
B. Catanzaro, N. Sundaram, and K. Keutzer, "A map reduce framework for programming graphics processors," in In Workshop on Software Tools for MultiCore Systems, 2008.
[23]
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, "Pregel: a system for large-scale graph processing," in Proceedings of the 28th ACM symposium on Principles ofdistributed computing, ser. PODC '09. New York, NY, USA: ACM, 2009, pp. 6--6.
[24]
J. Chhugani, N. Satish, C. Kim, J. Sewall, and P. Dubey, "Fast and efficient graph traversal algorithm for cpus: Maximizing single-node efficiency," in Parallel Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International, may 2012, pp. 378--389.
[25]
A. Buluç and K. Madduri, "Parallel breadth-first search on distributed memory systems," in Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '11. New York, NY, USA: ACM, 2011, pp. 65:1--65:12. {Online}. Available

Cited By

View all
  • (2014)Extreme big data processing in large-scale graph analytics and billion-scale social simulationProceedings of the 5th ACM/SPEC international conference on Performance engineering10.1145/2568088.2576096(1-2)Online publication date: 22-Mar-2014
  1. A scalable implementation of a mapreduce-based graph processing algorithm for large-scale heterogeneous supercomputers

          Recommendations

          Comments

          Please enable JavaScript to view thecomments powered by Disqus.

          Information & Contributors

          Information

          Published In

          cover image ACM Other conferences
          CCGRID '13: Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
          May 2013
          710 pages
          ISBN:9780768549965
          • General Chair:
          • Dick Epema

          Publisher

          IEEE Press

          Publication History

          Published: 13 May 2013

          Check for updates

          Author Tags

          1. GPGPU
          2. large-scale graph processing
          3. mapreduce

          Qualifiers

          • Research-article

          Conference

          CCGrid '13

          Contributors

          Other Metrics

          Bibliometrics & Citations

          Bibliometrics

          Article Metrics

          • Downloads (Last 12 months)0
          • Downloads (Last 6 weeks)0
          Reflects downloads up to 05 Jan 2025

          Other Metrics

          Citations

          Cited By

          View all
          • (2014)Extreme big data processing in large-scale graph analytics and billion-scale social simulationProceedings of the 5th ACM/SPEC international conference on Performance engineering10.1145/2568088.2576096(1-2)Online publication date: 22-Mar-2014

          View Options

          Login options

          View options

          PDF

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader

          Media

          Figures

          Other

          Tables

          Share

          Share

          Share this Publication link

          Share on social media