An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

Gregorio Quintana-Ortí⁶,
Enrique S. Quintana-Ortí⁶,
Alfredo Remón⁶ &
…
Robert A. van de Geijn⁷

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5336))

Included in the following conference series:

International Conference on High Performance Computing for Computational Science

1216 Accesses
3 Citations

Abstract

We pursue the scalable parallel implementation of the factor- ization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. The SuperMatrix run-time system allows an out-of-order scheduling of operations that is transparent to the programmer. Exper- imental results for the Cholesky factorization of band matrices on two parallel platforms with sixteen processors demonstrate the scalability of the solution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: GBP 19.95; Price includes VAT (United Kingdom)

eBook: GBP 35.99; Price includes VAT (United Kingdom)

Softcover Book: GBP 44.99; Price includes VAT (United Kingdom)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Parallelized Solution of Banded Linear Systems with an Introduction to p-adic Computation

A Parallel Multi-threaded Solver for Symmetric Positive Definite Bordered-Band Linear Systems

An asynchronous direct solver for banded linear systems

Article 04 January 2017

References

Anderson, E., Bai, Z., Demmel, J., Dongarra, J.E., DuCroz, J., Greenbaum, A., Hammarling, S., McKenney, A.E., Ostrouchov, S., Sorensen, D.: LAPACK Users Guide. SIAM, Philadelphia (1992)
MATH Google Scholar
Bellens, P., Perez, J.M., Badia, R.M., Labarta, J.: CellSs: A programming model for the Cell BE architecture. In: SC 2006: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, p. 86. ACM Press, New York (2006)
Google Scholar
Bientinesi, P., Gunnels, J.A., Myers, M.E., Quintana-Ortí, E.S., van de Geijn, R.A.: The science of deriving dense linear algebra algo- rithms. ACM Transactions on Mathematical Software 31(1), 1–26 (2005)
Article MathSciNet MATH Google Scholar
Bientinesi, P., Quintana-Ortí, E.S., van de Geijn, R.A.: Repre- senting linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Soft. 31(1), 27–59 (2005)
Article Google Scholar
Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: A class of par- allel tiled linear algebra algorithms for multicore architectures. LAPACK Working Note 191 UT-CS-07-600, University of Tennessee (September 2007)
Google Scholar
Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: Parallel tiled qr factorization for multicore architectures. LAPACK Working Note 190 UT-CS- 07-598, University of Tennessee (July 2007)
Google Scholar
Chan, E., Quintana-Ortí, E., Quintana-Ortí, G., van de Geijn, R.: SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In: SPAA 2007: Proceedings of the Nineteenth ACM Sym- posium on Parallelism in Algorithms and Architectures, pp. 116–126 (2007)
Google Scholar
Chan, E., Van Zee, F.G., Bientinesi, P., Quintana-Ortí, G., Quintana-Ortí, E.S., van de Geijn, R.: SuperMatrix: A multithreaded runtime scheduling system for algorithms-by-blocks. In: PPoPP 2008: Proceedings of the 13th ACM SIGPLAN symposium on Principles and practices of parallel programming. ACM Press, New York (to appear, 2008)
Google Scholar
Chan, E., Van Zee, F., van de Geijn, R., Quintana-Ortí, E.S., Quintana-Ortí, G.: Satisfying your dependencies with SuperMatrix. In: IEEE Cluster 2007, pp. 92–99 (2007)
Google Scholar
Chatterjee, S., Lebeck, A.R., Patnala, P.K., Thottethodi, M.: Recursive array layouts and fast matrix multiplication. IEEE Trans. on Parallel and Distributed Systems 13(11), 1105–1123 (2002)
Article Google Scholar
Dongarra, J.J., Duff, I.S., Sorensen, D.C., van der Vorst, H.A.: Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia (1991)
MATH Google Scholar
Elmroth, E., Gustavson, F., Gustavson, F., Jonsson, I., Kagstrom, B.: Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Review 46(1), 3–45 (2004)
Article MathSciNet MATH Google Scholar
Henry, G.: BLAS based on block data structures. Theory Center Technical Report CTC92TR89, Cornell University (February 1992)
Google Scholar
Leiserson, C., Plaat, A.: Programming parallel applications in Cilk. SIAM News, SINEWS (1998)
Google Scholar
Low, T.M., van de Geijn, R.: An API for manipulating matrices stored by blocks. Technical Report TR-2004-15, Department of Computer Sciences, The University of Texas at Austin (May 2004)
Google Scholar
Park, N., Hong, B., Prasanna, V.K.: Tiling, block data layout, and memory hier- archy performance. IEEE Trans. on Parallel and Distributed Systems 14(7), 640–654 (2003)
Article Google Scholar
Quintana-Ortí, G., Quintana-Ortí, E., Chan, E., Van Zee, F.G., van de Geijn, R.: Scheduling of QR factorization algorithms on SMP and multi-core architectures. In: 16th Euromicro Int. Conference on Parallel, Dis- tributed and Network-based Processing. IEEE, Los Alamitos (to appear, 2008)
Google Scholar
Quintana-Ortí, G., Quintana-Ortí, E.S., Chan, E., van de Geijn, R., Van Zee, F.G.: Design and scheduling of an algorithm-by-blocks for the LU factorization on multithreaded architectures. FLAME Working Note #26 TR-07-50, The University of Texas at Austin, Department of Computer Sciences (September 2007)
Google Scholar
Quintana-Ortí, G., Quintana-Ortí, E.S., Remón, A., van de Geijn, R.: SuperMatrix for the factorization of band matrices. FLAMEWorking Note #27 TR-07-51, The University of Texas at Austin, Department of Computer Sciences (September 2007)
Google Scholar
Remón, A., Quintana-Ortí, E.S., Quintana-Ortí, G.: Cholesky factorization of band matrices using multithreaded BLAS. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 608–616. Springer, Heidelberg (to appear, 2007)
Chapter Google Scholar

Download references

Author information

Authors and Affiliations

Departamento de Ingeniería y Ciencia de Computadores, Universidad Jaume I, 12.071-Castellón, Spain
Gregorio Quintana-Ortí, Enrique S. Quintana-Ortí & Alfredo Remón
Department of Computer Sciences, The University of Texas at Austin, Austin, Texas, 78712
Robert A. van de Geijn

Authors

Gregorio Quintana-Ortí
View author publications
You can also search for this author in PubMed Google Scholar
Enrique S. Quintana-Ortí
View author publications
You can also search for this author in PubMed Google Scholar
Alfredo Remón
View author publications
You can also search for this author in PubMed Google Scholar
Robert A. van de Geijn
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Faculdade de Engenharia da Universidade do Porto, Rua Dr. Roberto Frias s/n, 4200-465, Porto, Portugal
José M. Laginha M. Palma
University of Toulouse, INP (ENSEEIHT), IRIT, 2 rue Charles-Camichel, 31071, Toulouse CEDEX 7, France
Patrick R. Amestoy
University of Toulouse, INP (ENSEEIHT); IRIT, rue Charles-Camichel, 31071, Toulouse CEDEX 7, France
Michel Daydé
Federal University of Rio de Janeiro, P.O. Box 68511, 21941-972, Rio de Janeiro, RJ, Brazil
Marta Mattoso
Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, s/n, 4200-465, Porto, Portugal
João Correia Lopes

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Quintana-Ortí, G., Quintana-Ortí, E.S., Remón, A., van de Geijn, R.A. (2008). An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_21

Download citation

DOI: https://doi.org/10.1007/978-3-540-92859-1_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92858-4
Online ISBN: 978-3-540-92859-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

Parallelized Solution of Banded Linear Systems with an Introduction to p-adic Computation

A Parallel Multi-threaded Solver for Symmetric Positive Definite Bordered-Band Linear Systems

An asynchronous direct solver for banded linear systems

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

Parallelized Solution of Banded Linear Systems with an Introduction to p-adic Computation

A Parallel Multi-threaded Solver for Symmetric Positive Definite Bordered-Band Linear Systems

An asynchronous direct solver for banded linear systems

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation