DOI: 10.1109/ICDCS.2013.31
Article

HybridMR: A Hierarchical MapReduce Scheduler for Hybrid Data Centers

Published: 08 July 2013 Publication History

Abstract

Virtualized environments are attractive because they simplify cluster management while facilitating cost-effective workload consolidation. As a result, virtual machines in public clouds or private data centers have become the norm for running transactional applications like web services and virtual desktops. On the other hand, batch workloads like MapReduce are typically deployed in a native cluster to avoid the performance overheads of virtualization. While both the virtual and native environments have their own strengths and weaknesses, we demonstrate in this work that it is feasible to provide the best of these two computing paradigms in a hybrid platform. In this paper, we make a case for a hybrid data center consisting of native and virtual environments, and propose a two-phase hierarchical scheduler, called HybridMR, for the effective resource management of interactive and batch workloads. In the first phase, HybridMR classifies incoming MapReduce jobs based on the expected virtualization overheads, and uses this information to automatically guide placement between physical and virtual machines. In the second phase, HybridMR manages the run-time performance of MapReduce jobs collocated with interactive applications in order to provide best-effort delivery to batch jobs, while complying with the Service Level Agreements (SLAs) of interactive applications. By consolidating batch jobs with over-provisioned foreground applications, the available unused resources are better utilized, resulting in improved application performance and energy efficiency. Evaluations on a hybrid cluster consisting of 24 physical servers and 48 virtual machines, with a diverse workload mix of interactive and batch MapReduce applications, demonstrate that HybridMR can achieve up to 40% improvement in the completion times of MapReduce jobs over the virtual-only case, while complying with the SLAs of interactive applications. Compared to the native-only cluster, at the cost of a minimal performance penalty, HybridMR boosts resource utilization by 45% and achieves up to 43% energy savings. These results indicate that a hybrid data center with an efficient scheduling mechanism can provide a cost-effective solution for hosting both batch and interactive workloads.
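The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how overheads are estimated or how run-time control works, so the threshold value, the `est_overhead` field, the step size, the job names, and both function names below are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff: jobs whose estimated virtualization slowdown exceeds
# this fraction are sent to physical machines. The abstract gives no number.
OVERHEAD_THRESHOLD = 0.20


@dataclass
class Job:
    name: str
    est_overhead: float  # assumed estimate of slowdown when virtualized (0.2 = 20%)


def place(job: Job, threshold: float = OVERHEAD_THRESHOLD) -> str:
    """Phase 1 (sketch): route a job by its expected virtualization overhead."""
    return "physical" if job.est_overhead > threshold else "virtual"


def adjust_batch_share(share: float, latency_ms: float, sla_ms: float,
                       step: float = 0.1) -> float:
    """Phase 2 (sketch): shrink the batch resource share when the collocated
    interactive application misses its SLA, otherwise grow it (best effort).
    The result is clamped to the range [0.0, 1.0]."""
    if latency_ms > sla_ms:
        share -= step
    else:
        share += step
    return min(1.0, max(0.0, share))


# Made-up jobs: one with high assumed overhead, one with low assumed overhead.
jobs = [Job("sort", 0.35), Job("pi-estimate", 0.05)]
placements = {j.name: place(j) for j in jobs}
```

The simple additive step in `adjust_batch_share` is only one possible control policy; it is used here because it makes the SLA-first priority easy to see, not because the paper is known to use it.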

Published In

ICDCS '13: Proceedings of the 2013 IEEE 33rd International Conference on Distributed Computing Systems
July 2013
623 pages
ISBN: 9780769550008

Publisher

IEEE Computer Society

United States

Author Tags

  1. Energy
  2. Hadoop MapReduce
  3. Hybrid Data Center
  4. Performance
  5. Resource Management
  6. Scheduling
  7. Virtualization

Qualifiers

  • Article

Cited By

  • (2018) Multi-prediction based scheduling for hybrid workloads in the cloud data center. Cluster Computing 21:3 (1607-1622). DOI: 10.5555/3287988.3288008. Online publication date: 1-Sep-2018
  • (2018) rTuner. Proceedings of the 10th International Conference on Computer Modeling and Simulation (176-183). DOI: 10.1145/3177457.3191710. Online publication date: 8-Jan-2018
  • (2017) Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning. IEEE Transactions on Parallel and Distributed Systems 28:3 (774-786). DOI: 10.1109/TPDS.2016.2594765. Online publication date: 1-Mar-2017
  • (2017) Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce. Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (181-185). DOI: 10.1109/CCGRID.2017.96. Online publication date: 14-May-2017
  • (2016) History-based harvesting of spare cycles and storage in large-scale datacenters. Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation (755-770). DOI: 10.5555/3026877.3026935. Online publication date: 2-Nov-2016
  • (2016) A scalable Map Reduce tasks scheduling. International Journal of Computational Science and Engineering 14:1 (44-54). DOI: 10.1504/IJCSE.2017.081175. Online publication date: 1-Jan-2016
  • (2016) On exploiting data locality for iterative mapreduce applications in hybrid clouds. Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (118-122). DOI: 10.1145/3006299.3006329. Online publication date: 6-Dec-2016
  • (2015) Enabling big data analytics in the hybrid cloud using iterative mapreduce. Proceedings of the 8th International Conference on Utility and Cloud Computing (290-299). DOI: 10.5555/3233397.3233443. Online publication date: 7-Dec-2015
  • (2015) Minimizing Interference and Maximizing Progress for Hadoop Virtual Machines. ACM SIGMETRICS Performance Evaluation Review 42:4 (62-71). DOI: 10.1145/2788402.2788411. Online publication date: 2-Jun-2015
  • (2015) On Energy-aware Allocation and Execution for Batch and Interactive MapReduce. ACM SIGMETRICS Performance Evaluation Review 42:4 (22-30). DOI: 10.1145/2788402.2788407. Online publication date: 2-Jun-2015
