[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Skip to main content
Log in

Online execution time prediction for computationally intensive applications with periodic progress updates

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

The effectiveness of distributed execution of computationally intensive applications (jobs) largely depends on the quality of the applied scheduling approach. However, most of the existing non-trivial scheduling algorithms rely on prior knowledge or on prediction of application parameters, such as execution time, size of input and output, dependencies, etc., to assign applications to the available computational resources. A major issue is that these parameters are hard to determine in advance, especially if the end user does not possess an extensive history of previous application runs.

In this work we propose an online method for execution time prediction of applications, for which execution progress can be collected at run-time. Using dynamic progress information, the total job execution time can be predicted using extrapolation. However, the predictions achieved by extrapolation are far from precise and often vary over time as a result of changing application dynamics and varying resource load. Therefore, to compute the actual job execution time we match a number of predefined prediction evolution models against the consecutive extrapolations, by adopting nonlinear curve-fitting. The “best-fit” coefficients allow for more accurate execution time prediction.

The predictions made are used to enhance a dynamic scheduling algorithm for workflows introduced in our earlier work. The scheduling algorithm is run with and without curve-fitting, showing a performance improvement of up to 15% in the former case.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

References

  1. Casanova H, Marchal L (2002) A network model for simulation of grid application. Tech rep, École Normale Supérieure de Lyon, Laboratoire de l’Informatique du Parallélisme

  2. Chtepen M, Claeys F, Dhoedt B, De Turck F, Demeester P, Vanrolleghem P (2009) Scheduling of dynamic workflows in grids. IEEE Trans Parallel Distrib Syst 20(2):180–190

    Article  Google Scholar 

  3. Chtepen M, Claeys F, Dhoedt B, De Turck F, Vanrolleghem P, Demeester P (2006) Dynamic scheduling of computationally intensive applications on unreliable infrastructures. In: Proceedings of the 2nd European modeling and simulation symposium (EMSS ’06), Barcelona, Spain

    Google Scholar 

  4. Claeys F (2008) A generic software framework for modelling and virtual experimentation with complex environmental systems. Phd thesis, Dept of Applied Mathematics and Process Control, Ghent University, Belgium

  5. Claeys F, De Pauw D, Benedetti L, Nopens I, Vanrolleghem P (2006) Tornado: a versatile and efficient modelling & virtual experimentation kernel for water quality systems applications. In: Proceedings of the international environmental modelling and software conference (iEMSs), Burlington, Vermont, USA

    Google Scholar 

  6. Consortium, M.: Modelica website. http://www.modelica.org/

  7. Gropp W, Lusk E, Doss N, Skjellum A (1996) High-performance, portable implementation of the MPI message passing interface standard. Parallel Comput 22(6):789–828

    Article  MATH  Google Scholar 

  8. Iverson M, Ozguner F, Follen G (1996) Run-time statistical estimation of task execution times for heterogeneous distributed computing. In: Proceedings of the 5th IEEE international symposium on high performance distributed computing (HPDC-5 ’96), Syracuse, New York

    Google Scholar 

  9. Iverson M, Ozguner F, Potter L (1999) Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment. IEEE Trans Comput 48(12):1374–1379

    Article  Google Scholar 

  10. Lafreniere B, Sodan A (2005) ScoPred—scalable user-directed performance prediction using complexity modeling and historical data. In: Proceedings of the 11th workshop on job scheduling strategies for parallel processing, Cambridge, MA, pp 62–90

    Chapter  Google Scholar 

  11. Lee BD, Schopf J (2003) Run-time prediction of parallel applications on shared environments. In: Proceedings of the 5th IEEE international conference on cluster computing (CLUSTER’03), Hong Kong

    Google Scholar 

  12. Lublin U, Feitelson D (2003) The workload on parallel supercomputers: modeling the characteristics of rigid jobs. J Parallel Distrib Comput 63(11):1105–1122

    Article  MATH  Google Scholar 

  13. Mandal A, Kennedy K, Koelbel C, Marin G, Mellor-Crummey J, Liu B, Johnsson L (2005) Scheduling strategies for mapping application workflows onto the grid. In: Proceedings of the 14th IEEE international symposium on high performance distributed computing, Research Triangle Park, NC, USA, pp 125–134

    Chapter  Google Scholar 

  14. Nadeem F, Fahringer T (2009) Using templates to predict execution time of scientific workflow applications in the grid. In: Proceedings of the 9th IEEE/ACM international symposium on cluster computing and the grid, Shanghai, China, pp 316–323

    Chapter  Google Scholar 

  15. Nassif L, Nogueira J, Ahmed M, Karmouch A, Impey R, Vinicius de Andrade F (2005) Job completion prediction in grid using distributed case-based reasoning. In: Proceedings of the 14th IEEE international workshops on enabling technologies: infrastructure for collaborative enterprise (WETICE), Linkoping, Sweden

    Google Scholar 

  16. Press W, Teukolsky S, Vettering W, Flannery BP (1992) Numerical recipes in C, Chap 10, 2nd edn. Cambridge University Press, Cambridge, pp 408–412

    Google Scholar 

  17. Sadjadi S, Shimizu S, Figueroa J, Rangaswami R, Delgado J, Duran H, Collazo X (2008) A modeling approach for estimating execution time of long-running scientific applications. In: Proceedings of the 22nd IEEE international parallel and distributed processing symposium (IPDPS), the fifth high-performance grid computing workshop (HPGC-2008), Miami, FL, USA

    Google Scholar 

  18. UGent: BeGrid UGent. http://begrid.atlantis.ugent.be/

  19. Vanhooren H, Meirlaen J, Amerlinck Y, Claeys F, Vangheluwe H, Vanrolleghem P (2003) WEST: modelling biological wastewater treatment. J Hydroinf 5(1):27–50

    Google Scholar 

  20. Xu CZ, Wang L, Fong NT (2002) Stochastic prediction of execution time for dynamic bulk synchronous computations. J Supercomput 21(1):91–103

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Maria Chtepen.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chtepen, M., Claeys, F.H.A., Dhoedt, B. et al. Online execution time prediction for computationally intensive applications with periodic progress updates. J Supercomput 62, 768–786 (2012). https://doi.org/10.1007/s11227-012-0748-z

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-012-0748-z

Keywords

Navigation