DOI: 10.1145/2676662.2676673
research-article

SCADAMAR: scalable and data-efficient internet MapReduce

Published: 08 December 2014

Abstract

Recent developments in popular programming models, notably MapReduce, have raised interest in running MapReduce applications over the large-scale Internet. However, the data distribution techniques that current Internet-wide computing platforms use to deliver the high volumes of data needed by MapReduce jobs are naive and need to be rethought.
We therefore present SCADAMAR, a computing platform that runs MapReduce jobs over the Internet and makes two main contributions: i) it improves data distribution by using the BitTorrent protocol to distribute all data, and ii) it improves intermediate data availability by replicating tasks or data across nodes, which avoids losing intermediate data and thereby prevents large delays in the overall MapReduce execution time.
Along with the design of our solution, we present an extensive set of performance results that confirm the usefulness of these contributions, improved data distribution and availability, making our platform a feasible approach to running MapReduce jobs.
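
The second contribution, replicating tasks so that intermediate data survives node departures, can be illustrated with a minimal sketch. It is not SCADAMAR's scheduler: the round-robin placement policy and the names `assign_replicas` and `lost_tasks` are assumptions made only for this example, which shows how a replication factor determines which map outputs must be recomputed after a set of volunteer nodes fails.

```python
def assign_replicas(tasks, nodes, replicas):
    """Place each task on `replicas` distinct nodes (hypothetical round-robin policy)."""
    if replicas < 1 or replicas > len(nodes):
        raise ValueError("replication factor must be between 1 and len(nodes)")
    assignment = {}
    for index, task in enumerate(tasks):
        # Consecutive positions modulo len(nodes) are distinct while replicas <= len(nodes).
        assignment[task] = [nodes[(index + k) % len(nodes)] for k in range(replicas)]
    return assignment


def lost_tasks(assignment, failed_nodes):
    """Return the tasks whose every replica holder failed.

    The intermediate output of these tasks is gone and would have to be
    recomputed, which is the delay that replication is meant to avoid.
    """
    failed = set(failed_nodes)
    return [task for task, holders in assignment.items()
            if all(holder in failed for holder in holders)]


if __name__ == "__main__":
    map_tasks = ["m0", "m1", "m2", "m3"]
    volunteers = ["a", "b", "c"]

    single = assign_replicas(map_tasks, volunteers, replicas=1)
    double = assign_replicas(map_tasks, volunteers, replicas=2)

    # Without replication, losing node "a" loses two map outputs.
    print(lost_tasks(single, ["a"]))   # ['m0', 'm3']
    # With two replicas, the same failure loses nothing.
    print(lost_tasks(double, ["a"]))   # []
```

In this toy model a single failure is fully masked once every task has two holders. A real deployment would also have to weigh the extra computation and transfer that each additional replica costs.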

Cited By

  • (2019) Jargon of Hadoop MapReduce scheduling techniques: a scientific categorization. The Knowledge Engineering Review, 34. DOI: 10.1017/S0269888918000371. Online publication date: 15-Mar-2019
  • (2017) freeCycles - Efficient Multi-Cloud Computing Platform. Journal of Grid Computing, 15:4 (501-526). DOI: 10.1007/s10723-017-9414-2. Online publication date: 1-Dec-2017
  • (2015) Building Cloud Applications for Challenged Networks. Embracing Global Computing in Emerging Economies (1-10). DOI: 10.1007/978-3-319-25043-4_1. Online publication date: 21-Nov-2015

Published In

CCB '14: Proceedings of the 2nd International Workshop on CrossCloud Systems
December 2014
44 pages
ISBN:9781450332330
DOI:10.1145/2676662
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BOINC
  2. BitTorrent
  3. MapReduce
  4. cloud computing

Qualifiers

  • Research-article

Conference

Middleware '14

Acceptance Rates

CCB '14 Paper Acceptance Rate 7 of 14 submissions, 50%;
Overall Acceptance Rate 7 of 14 submissions, 50%
