Abstract
Large-scale MPI programs must work with dynamic and heterogeneous resources. While many of the issues involved can be handled by the MPI implementation, some must be dealt with by the application program. This paper considers a master/slave application in which MPI processes internally use varying numbers of OpenMP threads. We modify the standard master/slave pattern to allow for dynamic addition and withdrawal of slaves. Moreover, the application dynamically adapts its use of processors, assigning them to either processes or threads. The paper evaluates the support that MPI-2 provides for implementing the scheme, partly referring to experiments with the MPICH2 implementation. We found that most requirements can be met if optional parts of the standard are used, but handling slave crashes requires additional functionality.
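The scheme described in the abstract can be pictured with a small hybrid example. The following sketch is not taken from the paper; it only illustrates the MPI-2 and OpenMP mechanisms the abstract refers to: dynamic addition of slaves via MPI_Comm_spawn and run-time adjustment of the thread count via omp_set_num_threads. The message tags, the trivial task payload, and spawning argv[0] as the slave program are illustrative assumptions.

    /* Hypothetical sketch (not the paper's code): a master adds slaves at
     * run time with MPI_Comm_spawn; each slave processes its task with a
     * run-time-chosen number of OpenMP threads. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    #define TASK_TAG 1
    #define STOP_TAG 2

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Comm parent, slaves;

        /* Hybrid MPI+OpenMP: MPI is called only outside parallel regions,
         * so MPI_THREAD_FUNNELED suffices. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_get_parent(&parent);

        if (parent == MPI_COMM_NULL) {          /* master */
            /* Dynamically add two slaves (MPI-2 dynamic process creation);
             * the same binary is spawned and takes the slave branch below. */
            MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                           0, MPI_COMM_SELF, &slaves, MPI_ERRCODES_IGNORE);

            double task = 42.0, result;
            MPI_Send(&task, 1, MPI_DOUBLE, 0, TASK_TAG, slaves);
            MPI_Recv(&result, 1, MPI_DOUBLE, 0, MPI_ANY_TAG, slaves,
                     MPI_STATUS_IGNORE);
            printf("master received %f\n", result);

            /* Withdraw the slaves again. */
            MPI_Send(NULL, 0, MPI_DOUBLE, 0, STOP_TAG, slaves);
            MPI_Send(NULL, 0, MPI_DOUBLE, 1, STOP_TAG, slaves);
        } else {                                /* slave */
            for (;;) {
                double task, result = 0.0;
                MPI_Status st;
                MPI_Recv(&task, 1, MPI_DOUBLE, 0, MPI_ANY_TAG, parent, &st);
                if (st.MPI_TAG == STOP_TAG)
                    break;

                /* Adapt the thread count to the processors currently
                 * available to this process (here simply all of them). */
                omp_set_num_threads(omp_get_num_procs());

                #pragma omp parallel for reduction(+:result)
                for (int i = 0; i < 1000; i++)
                    result += task / 1000.0;

                MPI_Send(&result, 1, MPI_DOUBLE, 0, TASK_TAG, parent);
            }
        }
        MPI_Finalize();
        return 0;
    }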
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Leopold, C., Süß, M. (2006). Observations on MPI-2 Support for Hybrid Master/Slave Applications in Dynamic and Heterogeneous Environments. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2006. Lecture Notes in Computer Science, vol 4192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846802_41
DOI: https://doi.org/10.1007/11846802_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39110-4
Online ISBN: 978-3-540-39112-8
eBook Packages: Computer Science, Computer Science (R0)