Energy- and locality-efficient multi-job scheduling based on MapReduce for heterogeneous datacenter

Abstract

Job scheduling for MapReduce is an active research topic, especially in heterogeneous datacenters, where high energy consumption and operating costs are key challenges. Most previous work considers the scheduling optimization of only a single job. In this paper, we study multiple MapReduce jobs and focus on the goal of “jointly optimizing the scheduling time, job costs and energy consumption.” To that end, we develop an energy- and locality-efficient MapReduce multi-job scheduling algorithm for the heterogeneous datacenter. First, we use the rack as the basic unit of resource in job scheduling to reduce data communication between jobs and to facilitate energy savings. Second, according to the capacity of each heterogeneous rack, we design a multi-job pre-mapping method that optimizes the execution order of jobs and jointly optimizes the scheduling time, job costs and energy consumption. Based on this pre-mapping method, we can assign a job to virtual machines on the same rack, so as to minimize the number of online racks. This centralized mapping strategy helps to save energy and to reduce the data transmission of jobs. Third, the map and reduce tasks of a job are divided into multiple task groups for parallel execution, thereby further reducing data communication and energy consumption. Finally, experimental results demonstrate the advantages of our algorithm.
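
The idea of placing each job on a single rack while keeping few racks online can be illustrated with a small sketch. This is not the algorithm of the paper, whose details are not given in the abstract; it is a generic greedy bin-packing heuristic (largest jobs first) written under stated assumptions. The names `Rack`, `pre_map` and `online_racks`, and the use of "VM slots" as the unit of rack capacity and job demand, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rack:
    name: str
    capacity: int  # assumed unit: number of VM slots the rack can host
    jobs: list = field(default_factory=list)  # (job name, demand) pairs

    @property
    def free(self) -> int:
        # Slots not yet claimed by jobs placed on this rack.
        return self.capacity - sum(demand for _, demand in self.jobs)


def pre_map(jobs: dict, racks: list) -> dict:
    """Place every job entirely on one rack, preferring racks already online.

    jobs maps a job name to its demand in VM slots (an assumption of this
    sketch). Returns a mapping from job name to rack name.
    """
    placement = {}
    # Handle the largest jobs first, as in first-fit-decreasing bin packing.
    for job, demand in sorted(jobs.items(), key=lambda kv: -kv[1]):
        online = [r for r in racks if r.jobs and r.free >= demand]
        offline = [r for r in racks if not r.jobs and r.capacity >= demand]
        if online:
            # Reuse an online rack, choosing the tightest fit.
            target = min(online, key=lambda r: r.free)
        elif offline:
            # Otherwise power on the largest offline rack that fits the job.
            target = max(offline, key=lambda r: r.capacity)
        else:
            raise ValueError(f"no single rack can host job {job!r}")
        target.jobs.append((job, demand))
        placement[job] = target.name
    return placement


def online_racks(racks: list) -> list:
    # A rack counts as online once at least one job is mapped to it.
    return [r.name for r in racks if r.jobs]
```

Keeping a whole job on one rack keeps its map and reduce traffic inside that rack, and packing jobs onto racks that are already online lets the remaining racks stay powered off. A real scheduler would also have to weigh scheduling time and job cost, which this sketch ignores.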

Acknowledgements

This work was supported by the Science Research Project of Education Department of Hunan Province (18C0296); the Open Project of State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body (31715010); Hunan Provincial Natural Science Foundation of China (2018JJ2134); Hunan Provincial Young Talents Project (2018RS3095); and Ph.D. research startup foundation of Hunan University of Science and Technology (E51863).

Author information

Authors and Affiliations

School of Information and Electrical Engineering, Hunan University of Science and Technology, Xiangtan, China
Lei Chen
State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China
Zhao-Hua Liu

Corresponding author

Correspondence to Lei Chen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, L., Liu, ZH. Energy- and locality-efficient multi-job scheduling based on MapReduce for heterogeneous datacenter. SOCA 13, 297–308 (2019). https://doi.org/10.1007/s11761-019-00273-x

Received: 12 July 2019
Revised: 05 August 2019
Accepted: 13 August 2019
Published: 22 August 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11761-019-00273-x
