DOI: 10.1109/IPDPS.2011.260
Article

iMapReduce: A Distributed Computing Framework for Iterative Computation

Published: 16 May 2011

Abstract

Relational data are pervasive in many applications such as data mining and social network analysis. Such data sets are typically massive, containing at least millions, or even hundreds of millions, of relations. This creates demand for distributed computing frameworks that can process them on a large cluster. MapReduce is one such framework. However, many applications based on relational data need to parse and operate on the data over many iterations, and MapReduce lacks built-in support for iterative processing. This paper presents iMapReduce, a framework that supports iterative processing. iMapReduce lets users specify the iterative operations with map and reduce functions, and it carries out the iterative processing automatically without user involvement. More importantly, iMapReduce significantly improves the performance of iterative algorithms by (1) reducing the overhead of creating a new task in every iteration, (2) eliminating the shuffling of the static data in the shuffle stage of MapReduce, and (3) allowing asynchronous execution of each iteration, i.e., an iteration can start before all tasks of the previous iteration have finished. We implement iMapReduce on top of Apache Hadoop and show that it achieves a speedup of 1.2x to 5x over MapReduce implementations of well-known iterative algorithms.
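
To make the idea of specifying an iterative computation with map and reduce functions more concrete, here is a minimal single-process sketch in Python. It is not iMapReduce's API and not the authors' implementation. PageRank is an assumed example algorithm, and the names `map_fn`, `reduce_fn`, `iterate`, and all parameters are hypothetical. The sketch only illustrates two points from the abstract: the user writes one map and one reduce function while a driver repeats them, and the static data (the graph structure) stays in place while only the changing state (the ranks) passes through the shuffle step.

```python
from collections import defaultdict


def map_fn(node, rank, neighbors):
    """User-defined map: send an equal share of this node's rank to each out-neighbor."""
    share = rank / len(neighbors) if neighbors else 0.0
    for dst in neighbors:
        yield dst, share


def reduce_fn(contributions, n, damping=0.85):
    """User-defined reduce: combine the incoming shares into a new rank."""
    return (1 - damping) / n + damping * sum(contributions)


def iterate(static_graph, state, max_iters=100, tol=1e-6):
    """Hypothetical driver: repeats map and reduce until the state stops changing.

    static_graph (adjacency lists) is never shuffled; only the small
    per-node state values move between the map and reduce steps.
    """
    n = len(static_graph)
    iteration = 0
    for iteration in range(1, max_iters + 1):
        shuffled = defaultdict(list)  # stands in for the shuffle stage
        for node, rank in state.items():
            for dst, value in map_fn(node, rank, static_graph[node]):
                shuffled[dst].append(value)
        new_state = {node: reduce_fn(shuffled[node], n) for node in static_graph}
        delta = sum(abs(new_state[k] - state[k]) for k in state)
        state = new_state
        if delta < tol:  # assumed termination check: total change below a threshold
            break
    return state, iteration


# Tiny assumed example graph: every node has at least one out-edge.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks, iters = iterate(graph, {node: 1 / len(graph) for node in graph})
print(ranks, iters)
```

The loop above is synchronous and runs in one process. The persistent tasks and the asynchronous start of the next iteration described in the abstract are not modeled here.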

Published In

IPDPSW '11: Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum
May 2011
2107 pages
ISBN: 9780769545776

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Cited By

  • (2019) An effective framework for asynchronous incremental graph processing. Frontiers of Computer Science: Selected Publications from Chinese Universities 13:3 (539-551). DOI: 10.1007/s11704-018-7443-z. Online publication date: 25-May-2019
  • (2017) The Parallelization of Back Propagation Neural Network in MapReduce and Spark. International Journal of Parallel Programming 45:4 (760-779). DOI: 10.1007/s10766-016-0401-1. Online publication date: 1-Aug-2017
  • (2016) Mrs. Proceedings of the 6th Workshop on Python for High-Performance and Scientific Computing (76-85). DOI: 10.5555/3019083.3019093. Online publication date: 13-Nov-2016
  • (2016) The Six Pillars for Building Big Data Analytics Ecosystems. ACM Computing Surveys 49:2 (1-36). DOI: 10.1145/2963143. Online publication date: 2-Aug-2016
  • (2016) TomusBlobs. Concurrency and Computation: Practice & Experience 28:4 (950-976). DOI: 10.1002/cpe.3034. Online publication date: 25-Mar-2016
  • (2014) To overlap or not to overlap. Proceedings of the 5th International Workshop on Data-Intensive Computing in the Clouds (9-16). DOI: 10.1109/DataCloud.2014.7. Online publication date: 16-Nov-2014
  • (2014) A comprehensive view of Hadoop research-A systematic literature review. Journal of Network and Computer Applications 46:C (1-25). DOI: 10.1016/j.jnca.2014.07.022. Online publication date: 1-Nov-2014
  • (2014) An adaptive switching scheme for iterative computing in the cloud. Frontiers of Computer Science: Selected Publications from Chinese Universities 8:6 (872-884). DOI: 10.1007/s11704-014-3472-4. Online publication date: 1-Dec-2014
  • (2013) Optimization for iterative queries on MapReduce. Proceedings of the VLDB Endowment 7:4 (241-252). DOI: 10.14778/2732240.2732243. Online publication date: 1-Dec-2013
  • (2013) Mammoth. Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference (1-10). DOI: 10.1145/2494621.2494633. Online publication date: 9-Aug-2013
