[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3297663.3310312acmconferencesArticle/Chapter ViewAbstractPublication PagesicpeConference Proceedingsconference-collections
short-paper

Simulation Based Job Scheduling Optimization for Batch Workloads

Published: 04 April 2019 Publication History

Abstract

We present a simulation based approach for scheduling jobs that are part of a batch workflow. Our objective is to minimize the makespan, defined as completion time of the last job to leave the system in a batch workflow with dependencies. The existing job schedulers make scheduling decisions based on available cores, memory size, priority or execution time of jobs. This does not guarantee minimum makespan since contention for resources among concurrently running jobs are ignored. In our approach, prior to scheduling batch jobs on physical servers, we simulate the execution of jobs using a discrete event simulator. The simulator considers available cores and available memory bandwidth on distributed systems to accurately simulate the execution of jobs using resource contention models in a concurrent run. We also propose simulation based job scheduling algorithms that use underlying contention models and minimize the makespan by optimally mapping jobs onto the available nodes. Our approach ensures that job dependencies are adhered to during the simulation. We assess the efficacy of our job scheduling algorithms and contention models by performing experiments on a real cluster. Our experimental results show that simulation based approach improves the makespan by 15% to 35% depending on the nature of workload.

References

[1]
{n. d.}. IBM LSF. http:ls11-www.cs.tu-dortmund.de?people?hermesmanuals?LSF?users.pdf. Accessed: 2018--10--25.
[2]
{n. d.}. IBM Network-aware scheduling. https://www.ibm.com'support?knowledgecenter?en?SSETD4_9.1.3?lsf_admin?pe_network_aware_sched.html.Accessed: 2018--10--25.
[3]
S. Bardhan and D. A.Menasc. 2015. Predicting the Efect of Memory Contentionin Multi-Core Computers Using Analytic Performance Models.IEEE Trans.Comput.64, 8 (Aug 2015), 2279--2292.
[4]
S. Binato, W. J. Hery, D. M. Loewenstern, and M. G. C. Resende. 2002.A Grasp forJob Shop Scheduling. Springer US, Boston, MA, 59--79.
[5]
Rajkumar Buyya and Manzur Murshed. {n. d.}. GridSim: a toolkit for themodeling and simulation of distributed resource management and schedul-ing for Grid computing.Concurrency and Computation: Practice and Ex-perience14, 13-15 ({n. d.}), 1175--1220. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.710
[6]
Dheeraj Chahal and Benny Mathew. 2018. PROWL: Towards Predicting theRuntime of Batch Workloads. InCompanion of the 2018 ACM/SPEC InternationalConference on Performance Engineering, ICPE 2018, Berlin, Germany, April 09--13,2018. 199--200.
[7]
Dheeraj Chahal, Benny Mathew, and Manoj K. Nambiar. 2018. Predicting theRuntime of Memory Intensive Batch Workloads. InThe 47th International Con-ference on Parallel Processing, ICCP 2018, Workshop Proceedings, Eugene, OR, USA,August 13--16, 2018. 45:1--45:8.
[8]
Imran Ali Chaudhry and Abid Ali Khan. {n. d.}. A research survey: reviewof lexible job shop scheduling techniques.International Transactions in Op-erational Research23, 3 ({n. d.}), 551--591. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/itor.12199
[9]
Wei-neng Chen and Jun Zhang. 2009. An Ant Colony Optimization Approach toa Grid Worklow Scheduling Problem With Various QoS Requirements. 39 (012009), 29--43.
[10]
Andreas De Blanche and Thomas Lundqvist. 2014. A methodology for estimatingco-scheduling slowdowns due to memory bus contention on multicore nodes.InProceedings of the IASTED International Conference on Parallel and DistributedComputing and Networks, PDCN 2014 :. 216--223.
[11]
Andreas de Blanche and Thomas Lundqvist. 2015. Addressing characterizationmethods for memory contention aware co-scheduling.The Journal of Supercom-puting71, 4 (01 Apr 2015), 1451--1483.
[12]
Stephen Herbein, Dong H. Ahn, Don Lipari, Thomas R.W. Scogland, Marc Stear-man, Mark Grondona, Jim Garlick, Becky Springmeyer, and Michela Taufer. 2016.Scalable I/O-Aware Job Scheduling for Burst Bufer Enabled HPC Clusters. InProceedings of the 25th ACM International Symposium on High-Performance Par-allel and Distributed Computing (HPDC '16). ACM, New York, NY, USA, 69--80.
[13]
Hesam Izakian, Ajith Abraham, and Vaclav Snasel. 2009. Performance compari-son of six eicient pure heuristics for scheduling meta-tasks on heterogeneousdistributed environments.Neural Network World19, 6 (2009), 695.
[14]
Ajay Kattepur and Manoj Nambiar. 2015. Performance Modeling of Multi-tieredWeb Applications with Varying Service Demands.2015 IEEE International Paralleland Distributed Processing Symposium Workshop (IPDPSW)00 (2015), 415--424.
[15]
Maciej Malawski, Gideon Juve, Ewa Deelman, and Jarek Nabrzyski. 2012. Cost-and Deadline-constrained Provisioning for Scientiic Worklow Ensembles inIaaS Clouds. InProceedings of the International Conference on High PerformanceComputing, Networking, Storage and Analysis (SC '12). IEEE Computer SocietyPress, Los Alamitos, CA, USA, Article 22, 11 pages. http://dl.acm.org/citation.cfm?id=2388996.2389026
[16]
Benny Mathew and Dheeraj Chahal. 2017. DESiDE: Discrete Event SimulationDevelopers Environment. InProceedings of the 8th ACM/SPEC on InternationalConference on Performance Engineering (ICPE '17). ACM, New York, NY, USA,173--174.
[17]
John D. McCalpin. 1991--2007.STREAM: Sustainable Memory Bandwidth in HighPerformance Computers. Technical Report. University of Virginia, Charlottesville,Virginia. http://www.cs.virginia.edu/stream/ A continually updated technicalreport. http://www.cs.virginia.edu/stream/.
[18]
Maroua Nouiri, Abdelghani Bekrar, Abderezak Jemai, Smail Niar, and Ahmed Chi-heb Ammari. 2018. An efective and distributed particle swarm optimizationalgorithm for lexible job-shop scheduling problem.Journal of Intelligent Manu-facturing29, 3 (01 Mar 2018), 603--615.
[19]
I. Pietri, G. Juve, E. Deelman, and R. Sakellariou. 2014. A Performance Modelto Estimate Execution Time of Scientiic Worklows on the Cloud. In2014 9thWorkshop on Worklows in Support of Large-Scale Science. 11--19.
[20]
J. F. PÃrez, G. Casale, and S. Pacheco-Sanchez. 2015. Estimating ComputationalRequirements in Multi-Threaded Applications.IEEE Transactions on Software En-gineering41, 3 (March 2015), 264--278.
[21]
Ajay Ramaswamy, Nalini Kumar, Aravind Neelakantan, Herman Lam, and GregStitt. 2018. Scalable Behavioral Emulation of Extreme-Scale Systems UsingStructural Simulation Toolkit. InProceedings of the 47th International Conferenceon Parallel Processing (ICPP 2018). ACM, New York, NY, USA, Article 17, 11 pages.
[22]
Maria Alejandra Rodriguez and Rajkumar Buyya. 2014. Deadline Based ResourceProvisioningand Scheduling Algorithm for Scientiic Worklows on Clouds.IEEETransactions on Cloud Computing2 (2014), 222--235.
[23]
Jyoti Sahni and Deo Prakash Vidyarthi. 2018. A Cost-Efective Deadline-Constrained Dynamic Scheduling Algorithm for Scientiic Worklows in a CloudEnvironment.IEEE Transactions on Cloud Computing6 (2018), 2--18.
[24]
W. Wang, J. W. Davidson, and M. L. Sofa. 2016. Predicting the memory bandwidthand optimal core allocations for multi-threaded applications on large-scale NUMAmachines. In2016 IEEE International Symposium on High Performance ComputerArchitecture (HPCA). 419--431.
[25]
D. Xu, C. Wu, and P. Yew. 2010. On mitigating memory bandwidth contentionthrough bandwidth-aware scheduling. In2010 19th International Conference onParallel Architectures and Compilation Techniques (PACT). 237--247.
[26]
R. Yang, J. Antony, P. P. Janes, and A. P. Rendell. 2008. Memory and ThreadPlacement Efects as a Function of Cache Usage: A Study of the GaussianChemistry Code on the SunFire X4600 M2. In2008 International Symposiumon Parallel Architectures, Algorithms, and Networks (i-span 2008). 31--36.
[27]
Jia Yu and Rajkumar Buyya. 2006. Scheduling Scientiic Worklow Applicationswith Deadline and Budget Constraints Using Genetic Algorithms.Sci. Program.14, 3,4 (Dec. 2006), 217--230. http://dl.acm.org/citation.cfm?id=1376960.1376967
[28]
Wei Zheng and Rizos Sakellariou. 2013. Stochastic DAG scheduling using aMonte Carlo approach.J. Parallel and Distrib. Comput.73, 12 (2013), 1673--1689. Heterogeneity in Parallel andDistributed Computing.

Cited By

View all
  • (2023)Enhancing Task Scheduling in Cloud Computing: A Multi-Objective Cuckoo Search Algorithm Approach2023 3rd International Conference on Advancement in Electronics & Communication Engineering (AECE)10.1109/AECE59614.2023.10428205(869-874)Online publication date: 23-Nov-2023
  • (2023)Toward Building a Digital Twin of Job Scheduling and Power Management on an HPC SystemJob Scheduling Strategies for Parallel Processing10.1007/978-3-031-22698-4_3(47-67)Online publication date: 12-Jan-2023
  • (2022)An Automated Task Scheduling Model Using Non-Dominated Sorting Genetic Algorithm II for Fog-Cloud SystemsIEEE Transactions on Cloud Computing10.1109/TCC.2020.303238610:4(2294-2308)Online publication date: 1-Oct-2022

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICPE '19: Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering
April 2019
348 pages
ISBN:9781450362399
DOI:10.1145/3297663
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 April 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. batch jobs
  2. job scheduling algorithms
  3. makespan
  4. service demand

Qualifiers

  • Short-paper

Conference

ICPE '19

Acceptance Rates

ICPE '19 Paper Acceptance Rate 13 of 71 submissions, 18%;
Overall Acceptance Rate 252 of 851 submissions, 30%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)23
  • Downloads (Last 6 weeks)0
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2023)Enhancing Task Scheduling in Cloud Computing: A Multi-Objective Cuckoo Search Algorithm Approach2023 3rd International Conference on Advancement in Electronics & Communication Engineering (AECE)10.1109/AECE59614.2023.10428205(869-874)Online publication date: 23-Nov-2023
  • (2023)Toward Building a Digital Twin of Job Scheduling and Power Management on an HPC SystemJob Scheduling Strategies for Parallel Processing10.1007/978-3-031-22698-4_3(47-67)Online publication date: 12-Jan-2023
  • (2022)An Automated Task Scheduling Model Using Non-Dominated Sorting Genetic Algorithm II for Fog-Cloud SystemsIEEE Transactions on Cloud Computing10.1109/TCC.2020.303238610:4(2294-2308)Online publication date: 1-Oct-2022

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media