
Exploiting Dynamic Resource Allocation for Efficient Parallel Data Processing in the Cloud

Published: 01 June 2011

Abstract

In recent years, ad hoc parallel data processing has emerged as one of the killer applications for Infrastructure-as-a-Service (IaaS) clouds. Major cloud computing companies have started to integrate frameworks for parallel data processing into their product portfolios, making it easy for customers to access these services and deploy their programs. However, the processing frameworks currently in use were designed for static, homogeneous cluster setups and disregard the particular nature of a cloud. Consequently, the allocated compute resources may be inadequate for large parts of the submitted job and unnecessarily increase processing time and cost. In this paper, we discuss the opportunities and challenges for efficient parallel data processing in clouds and present our research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both task scheduling and execution. Particular tasks of a processing job can be assigned to different types of virtual machines, which are automatically instantiated and terminated during job execution. Based on this new framework, we perform extended evaluations of MapReduce-inspired processing jobs on an IaaS cloud system and compare the results to the popular data processing framework Hadoop.
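The cost argument in the abstract can be illustrated with a small sketch. This is not Nephele's API or scheduler: the names `Stage`, `dynamic_cost`, `static_cost`, the VM types, the prices, and the example job are all hypothetical, and the model assumes that stages run one after another and that a static cluster is sized for the most expensive stage and held for the entire job.

```python
from dataclasses import dataclass

# Hypothetical hourly prices in cents per VM type (illustrative only,
# not taken from the paper or from any real cloud provider).
PRICE_CENTS_PER_HOUR = {"small": 10, "large": 80}


@dataclass(frozen=True)
class Stage:
    """One sequential stage of a processing job (hypothetical model)."""
    name: str
    vm_type: str   # key into PRICE_CENTS_PER_HOUR
    vm_count: int  # number of VMs the stage runs on
    hours: int     # how long the stage runs


def dynamic_cost(stages):
    """Cost in cents when each stage's VMs exist only while that stage runs."""
    return sum(
        PRICE_CENTS_PER_HOUR[s.vm_type] * s.vm_count * s.hours for s in stages
    )


def static_cost(stages):
    """Cost in cents when one fixed cluster, sized for the most expensive
    stage, is kept allocated for the whole job."""
    total_hours = sum(s.hours for s in stages)
    peak_hourly = max(PRICE_CENTS_PER_HOUR[s.vm_type] * s.vm_count for s in stages)
    return peak_hourly * total_hours


# Hypothetical two-stage job: a wide stage on cheap VMs, then a narrow
# stage on one large VM.
EXAMPLE_JOB = [
    Stage("sort", "small", 6, 2),
    Stage("aggregate", "large", 1, 1),
]

if __name__ == "__main__":
    print("dynamic:", dynamic_cost(EXAMPLE_JOB), "cents")
    print("static: ", static_cost(EXAMPLE_JOB), "cents")
```

Under these assumed numbers, per-stage allocation costs 200 cents and the fixed cluster costs 240 cents. The difference comes only from paying for the large configuration during hours in which the job does not need it, which is the inefficiency the abstract attributes to static cluster setups.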



    Published In

    IEEE Transactions on Parallel and Distributed Systems, Volume 22, Issue 6
    June 2011
    174 pages

    Publisher

    IEEE Press

    Author Tags

    1. Many-task computing
    2. Cloud computing
    3. High-throughput computing
    4. Loosely coupled applications

    Qualifiers

    • Article


    Cited By

    • (2021) Autonomous resource management in distributed stream processing systems. Proceedings of the 22nd International Middleware Conference: Doctoral Symposium, pp. 19-22. DOI: 10.1145/3491087.3493680. Online publication date: 6-Dec-2021
    • (2020) Resource Management and Scheduling in Distributed Stream Processing Systems. ACM Computing Surveys 53:3, pp. 1-41. DOI: 10.1145/3355399. Online publication date: 28-May-2020
    • (2019) A Comprehensive Survey on Parallelization and Elasticity in Stream Processing. ACM Computing Surveys 52:2, pp. 1-37. DOI: 10.1145/3303849. Online publication date: 30-Apr-2019
    • (2019) Energy-Aware Fault-Tolerant Dynamic Task Scheduling Scheme for Virtualized Cloud Data Centers. Mobile Networks and Applications 24:3, pp. 1063-1077. DOI: 10.1007/s11036-018-1062-7. Online publication date: 1-Jun-2019
    • (2019) DistriPlan: an optimized join execution framework for geo-distributed scientific data. Distributed and Parallel Databases 38:1, pp. 127-152. DOI: 10.1007/s10619-019-07264-z. Online publication date: 23-Mar-2019
    • (2018) Preemptive cloud resource allocation modeling of processing jobs. The Journal of Supercomputing 74:5, pp. 2116-2150. DOI: 10.5555/3211601.3211667. Online publication date: 1-May-2018
    • (2018) Energy efficient workflow scheduling with virtual machine consolidation for green cloud computing. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 34:3, pp. 1561-1572. DOI: 10.3233/JIFS-169451. Online publication date: 1-Jan-2018
    • (2018) Implementing a Multi-layer Job Scheduling Approach with Effective Load Balancing and Energy Saving over a Cloud. Proceedings of the 2018 10th International Conference on Information Management and Engineering, pp. 55-58. DOI: 10.1145/3285957.3285981. Online publication date: 22-Sep-2018
    • (2018) A Model Predictive Controller for Managing QoS Enforcements and Microarchitecture-Level Interferences in a Lambda Platform. IEEE Transactions on Parallel and Distributed Systems 29:7, pp. 1442-1455. DOI: 10.1109/TPDS.2017.2779502. Online publication date: 1-Jul-2018
    • (2017) User's priority focused resource provisioning over cloud computing infrastructure. International Journal of Grid and Utility Computing 8:4, pp. 357-364. DOI: 10.1504/IJGUC.2017.088281. Online publication date: 1-Jan-2017
