DOI:10.1007/978-3-031-07312-0_17
Article

Hybrid Parallel ILU Preconditioner in Linear Solver Library GaspiLS

Published: 29 May 2022

Abstract

Krylov subspace solvers such as GMRES and preconditioners such as incomplete LU (ILU) are the most commonly used methods for efficiently solving general-purpose, large-scale linear systems in simulations. To fully exploit the increasing parallelism of modern hardware, parallel Krylov subspace solvers and preconditioners must scale well. As such, they are crucial for productivity: they hide the details of a complex hybrid parallel implementation behind a high-level abstraction that is easy for the domain expert to use. However, the ILU factorization and the subsequent triangular solve are sequential in their basic form. We use a multilevel nested dissection (MLND) ordering to resolve that issue and expose some parallelism. We investigate the parallel efficiency of a hybrid parallel ILU preconditioner that combines a restricted additive Schwarz (RAS) method on the process level with a shared-memory parallel MLND Crout ILU method on the thread level. We employ the PGAS-based programming model GASPI to implement the data exchange across processes efficiently. We demonstrate the scalability of our approach on up to 64 sockets (1280 cores) for the convection-diffusion problem, a representative of a large class of engineering problems, and show baseline performance comparable to that of the linear solver library PETSc. The RAS-preconditioned GMRES solver achieves about 80% parallel efficiency on 1280 cores. Our implementation provides a generic, algebraic, scalable, and efficient preconditioner that enables productivity for the domain expert in solving large-scale sparse linear systems.
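
To illustrate why a nested dissection ordering exposes parallelism in the triangular solve, the following minimal Python sketch counts the dependency levels of a lower-triangular solve for a toy 1D chain (a tridiagonal sparsity pattern). It is not the GaspiLS implementation. The function names `nested_dissection_order` and `count_solve_levels` are hypothetical, the recursion uses a simple midpoint separator rather than METIS, and the sketch assumes a no-fill pattern in which each row depends only on its graph neighbours that were ordered earlier.

```python
def nested_dissection_order(lo, hi):
    """Order nodes lo..hi-1 of a 1D chain: left part, right part, separator last.

    Toy stand-in for a nested dissection ordering (assumption: midpoint
    separator, no graph partitioner involved).
    """
    if lo >= hi:
        return []
    mid = (lo + hi) // 2  # single-node separator that splits the chain
    return (nested_dissection_order(lo, mid)
            + nested_dissection_order(mid + 1, hi)
            + [mid])


def count_solve_levels(n, order):
    """Count dependency levels of the lower-triangular solve on a chain of n nodes.

    A row must wait for every chain neighbour that was ordered before it.
    Rows sharing a level are mutually independent and could run concurrently,
    so fewer levels means fewer sequential steps.
    """
    pos = {node: k for k, node in enumerate(order)}
    level = {}
    for node in order:  # visit rows in elimination order
        earlier = [nb for nb in (node - 1, node + 1)
                   if 0 <= nb < n and pos[nb] < pos[node]]
        level[node] = 1 + max((level[nb] for nb in earlier), default=0)
    return max(level.values())


n = 15
natural_levels = count_solve_levels(n, list(range(n)))
nd_levels = count_solve_levels(n, nested_dissection_order(0, n))
print("natural ordering levels:", natural_levels)
print("nested dissection levels:", nd_levels)
```

With the natural ordering every row waits for its predecessor, so the number of levels equals the number of unknowns. With the dissection ordering the two halves on either side of a separator are independent, so the number of sequential steps is bounded by the depth of the recursion.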

Published In

High Performance Computing: 37th International Conference, ISC High Performance 2022, Hamburg, Germany, May 29 – June 2, 2022, Proceedings
May 2022
382 pages
ISBN:978-3-031-07311-3
DOI:10.1007/978-3-031-07312-0

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Sparse linear systems
  2. Parallel ILU preconditioner
  3. Domain decomposition
  4. GASPI
  5. METIS
  6. Hybrid parallelism
  7. Task-level parallelism
