Improving MapReduce Performance Using Smart Speculative Execution Strategy

Published: 01 April 2014

Abstract

MapReduce is a widely used parallel computing framework for large-scale data processing. The two major performance metrics in MapReduce are job execution time and cluster throughput. Both can be seriously degraded by straggler machines, that is, machines on which tasks take an unusually long time to finish. Speculative execution is a common approach to the straggler problem: slow-running tasks are simply backed up on alternative machines. Multiple speculative execution strategies have been proposed, but they have some pitfalls: (i) they use the average progress rate to identify slow tasks, while in reality the progress rate can be unstable and misleading; (ii) they cannot appropriately handle data skew among the tasks; (iii) they do not consider whether backup tasks can finish earlier when choosing backup worker nodes. In this paper, we first present a detailed analysis of scenarios in which existing strategies do not work well. We then develop a new strategy, maximum cost performance (MCP), which significantly improves the effectiveness of speculative execution. To identify stragglers accurately and promptly, MCP provides the following methods: (i) use both the progress rate and the process bandwidth within a phase to select slow tasks; (ii) use an exponentially weighted moving average (EWMA) to predict process speed and calculate a task's remaining time; (iii) determine which task to back up based on the load of the cluster, using a cost-benefit model. To choose proper worker nodes for backup tasks, we take both data locality and data skew into consideration. We evaluate MCP in a cluster of 101 virtual machines running a variety of applications on 30 physical servers. Experimental results show that MCP can run jobs up to 39 percent faster and improve cluster throughput by up to 44 percent compared to Hadoop-0.21.
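
The EWMA step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give MCP's smoothing factor, sampling interval, or units, so the value `alpha=0.2`, the function names `ewma_speed` and `remaining_time`, and the use of bytes per second are assumptions made only for illustration; this is not the authors' implementation.

```python
def ewma_speed(samples, alpha=0.2):
    """Smooth a series of observed process speeds with an EWMA.

    `samples` is ordered oldest to newest. `alpha` is a hypothetical
    smoothing factor; the abstract does not state the value MCP uses.
    """
    if not samples:
        raise ValueError("need at least one speed sample")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimate = samples[0]  # seed the average with the first observation
    for speed in samples[1:]:
        # Newer samples get weight alpha; older history decays geometrically.
        estimate = alpha * speed + (1.0 - alpha) * estimate
    return estimate


def remaining_time(remaining_bytes, samples, alpha=0.2):
    """Estimate seconds left as remaining work divided by predicted speed."""
    speed = ewma_speed(samples, alpha)
    if speed <= 0:
        return float("inf")  # a stalled task never finishes at this rate
    return remaining_bytes / speed


if __name__ == "__main__":
    # Hypothetical speeds in bytes/second for a task that slows down over time.
    observed = [100.0, 90.0, 40.0, 35.0]
    print(remaining_time(1_000_000, observed))
```

Compared with a plain average over the task's lifetime, the smoothed estimate reacts to a recent slowdown while damping single noisy samples, which is the general motivation for using an EWMA in this kind of prediction.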

Published In

IEEE Transactions on Computers, Volume 63, Issue 4
April 2014
264 pages

Publisher

IEEE Computer Society

United States

Qualifiers

  • Research-article
