[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

A probabilistic scheduling heuristic for computational grids

Published: 01 January 2006 Publication History

Abstract

Computational grids are large scale distributed networks of peer clusters of computing resources bounded by a decentralized management framework for the purpose of providing computing services, called grid services. The scheduling problem consists in finding the clusters that host the required set of grid services with a sufficient available capacity to handle user service requests in compliance with some specified quality of service. The interplay of intermittent resource participation, resource load dynamics, network latency and processing delay, and random subsystem failures creates a ubiquitous uncertainty on the state of the grid capacity to handle user requests. In addition to the need to account for this uncertainty, the scheduling strategy has to be decentralized since computational grids span distinct management domains. In this paper, we propose a decentralized scheduling strategy that views the dynamics of the grid service capacity as a stochastic process modeled by a Markov chain. The proposed scheduling scheme uses this model to predict the future local availability of resources. This is consolidated by a confidence model that approximates the future ability of peer clusters to successfully handle delegated service requests. The scalability of the proposed scheduling strategy is illustrated through simulation.

References

[1]
{1} D. Abramson, R. Buyya and J. Giddy, A computational economy for grid computing and its implementation in the Nimrod-G resource broker, Future Generation Computer Systems <b>18</b> (2002), 1061-1074.]]
[2]
{2} I. Ahmad and Y.-K. Kwok, On parallelizing the multiprocessor scheduling problem, IEEE Transactions on Parallel and Distributed Systems <b>10</b> (1999), 414-432.]]
[3]
{3} I. Ahmad and Y.-K. Kwok, Parallel Program Scheduling Techniques, in: High Performance Cluster Computing, (Vol. 1), Prentice Hall PTR, New Jersey, 1999, pp. 553-578.]]
[4]
{4} M.A. Al-Mouhamed, Lower bound on the number of processors and time for scheduling precedence graphs with communication costs, IEEE Transactions on Software Engineering <b>16</b> (1990), 1317-1322.]]
[5]
{5} F. Berman, H. Casanova, A. Chien, K. Cooper, H. Dall, A. Dasgupta, W. Deng, J. Dongarra, L. Johnsson, K. Kennedy, C. Koelbel, B. Liu, X. Liu, A. Mandal, G. Matin, M. Mazina, J. Mellor-Crammey, C. Mendes, A. Olugbile, M. Patel, D. Reed, Z. Shi, O. Sievert, H. Xia and A. Yarkhan, New grid scheduling and rescheduling methods in the GRADS project, International Journal of Parallel Programming <b>33</b> (2005), 209-229.]]
[6]
{6} F. Berman, R. Wolski, H. Casanova, W. Cirne, H. Dail, M. Faerman, S. Figueira, J. Hayes, G. Obertelli, J. Schopf, G. Shao, S. Smallen, N. Spring, A. Su and D. Zagorodnov, Adaptive computing on the grid using AppLeS, IEEE Transactions on Parallel and Distributed Systems <b>14</b> (2003), 369-382.]]
[7]
{7} T.D. Braun, H.J. Siegel, N. Beck, L. Boloni, M. Maheswaran, A.I. Reuther, J.P. Robertson, M.D. Theys and B. Yao, Taxonomy for describing matching and scheduling heuristics for mixed-machine heterogeneous computing systems, in: Proceedings of the IEEE Symposium on Reliable Distributed Systems , 1998, pp. 330-335.]]
[8]
{8} T.D. Braun, H.J. Siegel, N. Beck, L.L. Boloni, M. Maheswaran, A.I. Reuther, J.P. Robertson, M.D. Theys, B. Yao, D. Hensgen and R.F. Freund, A Comparison of Eleven Static Heuristics for Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing Systems, Journal of Parallel and Distributed Computing <b>61</b> (2001), 810-837.]]
[9]
{9} R. Buyya, High performance cluster computing, Prentice Hall PTR, Upper Saddle River, N.J., 1999.]]
[10]
{10} R. Buyya, D. Abramson, J. Giddy and H. Stockinger, Economic models for resource management and scheduling in grid computing, Concurrency Computation Practice and Experience <b>14</b> (2002), 1507-1542.]]
[11]
{11} R. Buyya, D. Abramson and S. Venugopal, The grid economy, Proceedings of the IEEE <b>93</b> (2005), 698-714.]]
[12]
{12} R. Buyya, K. Branson, J. Giddy and D. Abramson, The Virtual Laboratory: A toolset to enable distributed molecular modelling for drug design on the world-wide grid, Concurrency Computation Practice and Experience <b>15</b> (2003), 1-25.]]
[13]
{13} J. Cao, S.A. Jarvis, S. Saini, D.J. Kerbyson and G.R. Nudd, ARMS: An agent-based resource management system for grid computing, Scientific Programming <b>10</b> (2002), 135-148.]]
[14]
{14} J. Cao, D.P. Spooner, S.A. Jarvis and G.R. Nudd, Grid load balancing using intelligent agents, Future Generation Computer Systems <b>21</b> (2005), 135-149.]]
[15]
{15} H. Casanova, T.M. Bartol, J. Stiles and F. Berman, Distributing MCell simulations on the grid, International Journal of High Performance Computing Applications <b>15</b> (2001), 243-257.]]
[16]
{16} H. Casanova and J. Dongarra, NetSolve: A network-enabled server for solving computational science problems, International Journal of Supercomputer Applications and High Performance Computing <b>11</b> (1997), 212-223.]]
[17]
{17} T.L. Casavant and J.G. Kuhl, A Taxonomy of Scheduling in General-Purpose Distributed Computing Systems, IEEE Transactions on Software Engineering <b>14</b> (1988), 141-155.]]
[18]
{18} S.J. Chapin, D. Katramatos, J. Karpovich and A. Grimshaw, Resource management in Legion, Future Generation Computer Systems <b>15</b> (1999), 583-594.]]
[19]
{19} L. Chunlin and L. Layuan, A distributed utility-based two level market solution for optimal resource scheduling in computational grid, Parallel Computing <b>31</b> (2005), 332-351.]]
[20]
{20} M. Cosnard and M. Loi, Automatic Task Graph Generation Techniques, Parallel Processing Letters <b>54</b> (1995), 527-538.]]
[21]
{21} K. Czajkowski, S. Fitzgerald, I. Foster and C. Kesselman, Grid information services for distributed resource sharing, in: IEEE International Symposium on High Performance Distributed Computing, Proceedings, 2001, pp. 181-194.]]
[22]
{22} K. Czajkowski, I. Foster and C. Kesselman, Agreement-based resource management, Proceedings of the IEEE <b>93</b> (2005), 631-643.]]
[23]
{23} Y. Derbal, Service-oriented model of grid resource availability, International Journal of Computers and Applications (2005), In review.]]
[24]
{24} Y. Derbal, Service oriented grid resource modeling and management, in: Proceedings of the 1st International Conference on Web Information Systems and Technologies (WEBIST 2005), 2005, pp. 146-153.]]
[25]
{25} V. Di Martino and M. Mililotti, Sub optimal scheduling in a grid using genetic algorithms, Parallel Computing <b>30</b> (2004), 553-565.]]
[26]
{26} P.A. Dinda, Online prediction of the running time of tasks, in: IEEE International Symposium on High Performance Distributed Computing, Proceedings, 2001, pp. 383-394.]]
[27]
{27} H. El-Rewini, T.G. Lewis and H.H. Ali, Task scheduling in parallel and distributed systems, Prentice Hall, Englewood Cliffs, N.J., 1994.]]
[28]
{28} M. Faerman, A. Birnbaum, F. Berman and H. Casanova, Resource allocation strategies for guided parameter space searches, International Journal of High Performance Computing Applications <b>17</b> (2003), 383-402.]]
[29]
{29} I. Foster and C. Kesselman, The grid: blueprint for a new computing infrastructure, Elsevier Science, San Francisco, 2004.]]
[30]
{30} Y. Gao, H. Rong and J.Z. Huang, Adaptive grid job scheduling with genetic algorithms, Future Generation Computer Systems <b>21</b> (2005), 151-161.]]
[31]
{31} R. Gary and D. Johnson, Computers and Intractability - A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.]]
[32]
{32} A. Goldman and C. Queiroz, A model for parallel job scheduling on dynamical computer Grids, Concurrency Computation Practice and Experience <b>16</b> (2004), 461-468.]]
[33]
{33} S. Graupner, V. Kotov, A. Andrzejak and H. Trinks, Service-centric globally distributed computing, IEEE Internet Computing <b>7</b>(2003), 36-43.]]
[34]
{34} J. Gustafson, Program of grand challenge problems: expectations and results, in: Aizu International Symposium on Parallel Algorithms/Architecture Synthesis, 1997, pp. 2-7.]]
[35]
{35} L. He, S.A. Jarvis, D.P. Spooner, X. Chen and G.R. Nudd, Hybrid performance-based workload management for multiclusters and grids, IEE Proceedings: Software <b>151</b> (2004), 224-231.]]
[36]
{36} X. He, X. Sun and G. Von Laszewski, QoS guided Min-Min heuristic for grid task scheduling, Journal of Computer Science and Technology <b>18</b> (2003), 442-451.]]
[37]
{37} R.L. Henderson, Job scheduling under the Portable Batch System, in: Lecture Notes in Computer Science, (Vol. 949), Springer Verlag, Heidelberg, D-69121, Germany, 1995, pp. 279-294.]]
[38]
{38} E. Huedo, R.S. Montero and I.M. Llorente, Experiences on adaptive grid scheduling of parameter sweep applications, in: Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing, Institute of Electrical and Electronics Engineers Inc., 2004, pp. 28-33.]]
[39]
{39} E. Huedo, R.S. Montero and I.M. Llorente, A framework for adaptive execution in grids, Software - Practice and Experience <b>34</b> (2004), 631-651.]]
[40]
{40} J.-J. Hwang, Y.-C. Chow, F.D. Anger and C.-Y. Lee, Scheduling precedence graphs in systems with interprocessor communication times, SIAM Journal on Computing <b>18</b> (1989), 244-257.]]
[41]
{41} D. Jackson, Q. Snell and M. Clement, Core algorithms of the Maul scheduler, in: Lecture Notes Computer Science, Springer, Berlin, 2001, pp. 87-102.]]
[42]
{42} N. Kapadia and J. Fortes, PUNCH: An architecture for Web-enabled wide-area network-computing, Cluster Computing: The Journal of Networks, Software Tools and Applications <b>2</b> (1999), 153-164.]]
[43]
{43} K. Krauter, R. Buyya and M. Maheswaran, A taxonomy and survey of grid resource management systems for distributed computing, Software - Practice and Experience <b>32</b> (2002), 135-164.]]
[44]
{44} N. Krothapalli and A.V. Deshmukh, Dynamic allocation of communicating tasks in computational grids, IIE Transactions (Institute of Industrial Engineers) <b>36</b> (2004), 1037-4053.]]
[45]
{45} Y.-K. Kwok and I. Ahmad, Benchmarking the task graph scheduling algorithms, in: Proceedings of the International Parallel Processing Symposium, IPPS, 1998, pp. 531-537.]]
[46]
{46} Y.-K. Kwok and I. Ahmad, Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors, IEEE Transactions on Parallel and Distributed Systems <b>7</b>(1996), 506-521.]]
[47]
{47} Y.-K. Kwok and I. Ahmad, Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors, ACM Computing Surveys <b>31</b> (1999), 406-471.]]
[48]
{48} D. Lifka, The ANL/IBM SP scheduling system, in: Lecture Notes Computer Science, (Vol. 2221), Springer, Berlin, 2001, pp. 187-191.]]
[49]
{49} M.J. Litzkow, M. Livny and M.W. Matka, Condor- a hunter of idle workstations, in: Proceedings - International Conference on Distributed Computing Systems, 1988, pp. 104-111.]]
[50]
{50} H. Nakada, M. Sato and S. Sekiguchi, Design and implementations of Ninf: towards a global computing infrastructure, Future Generation Computer Systems <b>15</b> (1999), 649-658.]]
[51]
{51} A.L. Rosenberg and M. Yurkewych, Guidelines for scheduling some common computation-dags for internet-based computing, IEEE Transactions on Computers <b>54</b> (2005), 428-438.]]
[52]
{52} S.M. Ross, Introduction to Probability Models, Academic Press, Inc., San Diego, CA, 1989.]]
[53]
{53} M.P. Singh and M.N. Huhns, Service-Oriented Computing: Semantics, Processes, Agents, John Wiley & Sons, 2005.]]
[54]
{54} D.P. Spooner, S.A. Jarvis, J. Cao, S. Salni and G.R. Nudd, Local grid scheduling techniques using performance prediction, IEE Proceedings: Computers and Digital Techniques <b>150</b> (2003), 87-96.]]
[55]
{55} X.-H. Sun and M. Wu, Grid Harvest Service: a system for long-term, application-level task scheduling, in: Parallel and Distributed Processing Symposium, 2003, pp. 25-33.]]
[56]
{56} E. Tantoso, H.A. Wahab and H.Y. Chan, Molecular docking: An example of grid enabled applications, New Generation Computing <b>22</b> (2004), 189-190.]]
[57]
{57} C. Weng and X. Lu, Heuristic scheduling for bag-of-tasks applications in combination with QoS in the computational grid, Future Generation Computer Systems <b>21</b> (2005), 271-280.]]
[58]
{58} L. Yang, I. Foster and J.M. Schopf, Homeostatic and Tendency-based CPU Load Predictions, in: Procedeeings of the Parallel and Distributed Processing Symposium, 2003, pp. 42-51.]]
[59]
{59} L. Yang, J.M. Schopf and I. Foster, Conservative Scheduling: Using Predicted Variance to Improve Scheduling Decisions in Dynamic Environments, in: Proceedings of the 2003 ACM/IEEE conference on Supercomputing, 2003, pp. 31-47.]]
[60]
{60} J. Yi, C. Choi, K. Park and S. Kim, Efficient dynamic resource reallocation scheme using time-slot connection pattern, in: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, Association for Computing Machinery, New York, United States, 2003, pp. 677-682.]]
[61]
{61} S. Zhou, X. Zheng, J. Wang and P. Delisle, Utopia: A load sharing facility for large, heterogeneous distributed computer systems, Software -Practice and Experience <b>23</b> (1993), 224-234.]]

Cited By

View all
  • (2018)A model of grid service capacityJournal of Computer Science and Technology10.1007/s11390-007-9074-y22:4(505-514)Online publication date: 21-Dec-2018
  • (2018)A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-grid resource sharingThe Journal of Supercomputing10.1007/s11227-007-0169-645:2(185-204)Online publication date: 30-Dec-2018

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Multiagent and Grid Systems
Multiagent and Grid Systems  Volume 2, Issue 1
January 2006
84 pages

Publisher

IOS Press

Netherlands

Publication History

Published: 01 January 2006

Author Tags

  1. Markov chain
  2. computational grid
  3. decentralized scheduling
  4. probabilistic scheduling
  5. service capacity

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 31 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2018)A model of grid service capacityJournal of Computer Science and Technology10.1007/s11390-007-9074-y22:4(505-514)Online publication date: 21-Dec-2018
  • (2018)A probabilistic and adaptive scheduling algorithm using system-generated predictions for inter-grid resource sharingThe Journal of Supercomputing10.1007/s11227-007-0169-645:2(185-204)Online publication date: 30-Dec-2018

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media