[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to main content

An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization

  • Conference paper
High Performance Computing for Computational Science - VECPAR 2008 (VECPAR 2008)

Abstract

We pursue the scalable parallel implementation of the factor- ization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. The SuperMatrix run-time system allows an out-of-order scheduling of operations that is transparent to the programmer. Exper- imental results for the Cholesky factorization of band matrices on two parallel platforms with sixteen processors demonstrate the scalability of the solution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
GBP 19.95
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
GBP 35.99
Price includes VAT (United Kingdom)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
GBP 44.99
Price includes VAT (United Kingdom)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Anderson, E., Bai, Z., Demmel, J., Dongarra, J.E., DuCroz, J., Greenbaum, A., Hammarling, S., McKenney, A.E., Ostrouchov, S., Sorensen, D.: LAPACK Users Guide. SIAM, Philadelphia (1992)

    MATH  Google Scholar 

  2. Bellens, P., Perez, J.M., Badia, R.M., Labarta, J.: CellSs: A programming model for the Cell BE architecture. In: SC 2006: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, p. 86. ACM Press, New York (2006)

    Google Scholar 

  3. Bientinesi, P., Gunnels, J.A., Myers, M.E., Quintana-Ortí, E.S., van de Geijn, R.A.: The science of deriving dense linear algebra algo- rithms. ACM Transactions on Mathematical Software 31(1), 1–26 (2005)

    Article  MathSciNet  MATH  Google Scholar 

  4. Bientinesi, P., Quintana-Ortí, E.S., van de Geijn, R.A.: Repre- senting linear algebra algorithms in code: The FLAME application programming interfaces. ACM Trans. Math. Soft. 31(1), 27–59 (2005)

    Article  Google Scholar 

  5. Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: A class of par- allel tiled linear algebra algorithms for multicore architectures. LAPACK Working Note 191 UT-CS-07-600, University of Tennessee (September 2007)

    Google Scholar 

  6. Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: Parallel tiled qr factorization for multicore architectures. LAPACK Working Note 190 UT-CS- 07-598, University of Tennessee (July 2007)

    Google Scholar 

  7. Chan, E., Quintana-Ortí, E., Quintana-Ortí, G., van de Geijn, R.: SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures. In: SPAA 2007: Proceedings of the Nineteenth ACM Sym- posium on Parallelism in Algorithms and Architectures, pp. 116–126 (2007)

    Google Scholar 

  8. Chan, E., Van Zee, F.G., Bientinesi, P., Quintana-Ortí, G., Quintana-Ortí, E.S., van de Geijn, R.: SuperMatrix: A multithreaded runtime scheduling system for algorithms-by-blocks. In: PPoPP 2008: Proceedings of the 13th ACM SIGPLAN symposium on Principles and practices of parallel programming. ACM Press, New York (to appear, 2008)

    Google Scholar 

  9. Chan, E., Van Zee, F., van de Geijn, R., Quintana-Ortí, E.S., Quintana-Ortí, G.: Satisfying your dependencies with SuperMatrix. In: IEEE Cluster 2007, pp. 92–99 (2007)

    Google Scholar 

  10. Chatterjee, S., Lebeck, A.R., Patnala, P.K., Thottethodi, M.: Recursive array layouts and fast matrix multiplication. IEEE Trans. on Parallel and Distributed Systems 13(11), 1105–1123 (2002)

    Article  Google Scholar 

  11. Dongarra, J.J., Duff, I.S., Sorensen, D.C., van der Vorst, H.A.: Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia (1991)

    MATH  Google Scholar 

  12. Elmroth, E., Gustavson, F., Gustavson, F., Jonsson, I., Kagstrom, B.: Recursive blocked algorithms and hybrid data structures for dense matrix library software. SIAM Review 46(1), 3–45 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  13. Henry, G.: BLAS based on block data structures. Theory Center Technical Report CTC92TR89, Cornell University (February 1992)

    Google Scholar 

  14. Leiserson, C., Plaat, A.: Programming parallel applications in Cilk. SIAM News, SINEWS (1998)

    Google Scholar 

  15. Low, T.M., van de Geijn, R.: An API for manipulating matrices stored by blocks. Technical Report TR-2004-15, Department of Computer Sciences, The University of Texas at Austin (May 2004)

    Google Scholar 

  16. Park, N., Hong, B., Prasanna, V.K.: Tiling, block data layout, and memory hier- archy performance. IEEE Trans. on Parallel and Distributed Systems 14(7), 640–654 (2003)

    Article  Google Scholar 

  17. Quintana-Ortí, G., Quintana-Ortí, E., Chan, E., Van Zee, F.G., van de Geijn, R.: Scheduling of QR factorization algorithms on SMP and multi-core architectures. In: 16th Euromicro Int. Conference on Parallel, Dis- tributed and Network-based Processing. IEEE, Los Alamitos (to appear, 2008)

    Google Scholar 

  18. Quintana-Ortí, G., Quintana-Ortí, E.S., Chan, E., van de Geijn, R., Van Zee, F.G.: Design and scheduling of an algorithm-by-blocks for the LU factorization on multithreaded architectures. FLAME Working Note #26 TR-07-50, The University of Texas at Austin, Department of Computer Sciences (September 2007)

    Google Scholar 

  19. Quintana-Ortí, G., Quintana-Ortí, E.S., Remón, A., van de Geijn, R.: SuperMatrix for the factorization of band matrices. FLAMEWorking Note #27 TR-07-51, The University of Texas at Austin, Department of Computer Sciences (September 2007)

    Google Scholar 

  20. Remón, A., Quintana-Ortí, E.S., Quintana-Ortí, G.: Cholesky factorization of band matrices using multithreaded BLAS. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds.) PARA 2006. LNCS, vol. 4699, pp. 608–616. Springer, Heidelberg (to appear, 2007)

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Quintana-Ortí, G., Quintana-Ortí, E.S., Remón, A., van de Geijn, R.A. (2008). An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization. In: Palma, J.M.L.M., Amestoy, P.R., Daydé, M., Mattoso, M., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2008. VECPAR 2008. Lecture Notes in Computer Science, vol 5336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92859-1_21

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-92859-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92858-4

  • Online ISBN: 978-3-540-92859-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics