Research article, DOI: 10.1145/2371316.2371353

Performance study of matrix computations using multi-core programming tools

Published: 16 September 2012

Abstract

Basic matrix computations such as vector and matrix addition, dot product, outer product, matrix transpose, matrix-vector multiplication and matrix multiplication are challenging computational kernels that arise throughout scientific computing. In this paper, we parallelize these kernels using multi-core parallel programming tools: Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SMPSs, SWARM and FastFlow. The purpose of the paper is to present a unified quantitative and qualitative study of these tools for parallel matrix computations on multi-core platforms. Based on the performance results obtained with compiler optimization enabled, we conclude that Intel ArBB and SWARM are the most appropriate tools because they combine good performance with simplicity of programming. In particular, Intel ArBB is a good choice for compute-intensive operations such as the matrix product, where it gives significant speedup over the serial implementation. SWARM, on the other hand, gives good performance for medium-size matrix operations such as vector addition, matrix addition, outer product and matrix-vector product.
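
As an illustration of the kind of kernel the abstract describes, the following is a minimal sketch of a matrix-vector product written once serially and once with an OpenMP work-sharing loop. It is not the authors' code: the abstract does not show any implementation, so the function names `matvec_serial` and `matvec_omp`, the row-major storage layout and the static schedule are all assumptions made for this example.

```c
#include <stddef.h>

/* Serial reference: y = A * x, with A stored row-major as an n x m array.
 * (Hypothetical helper, not taken from the paper.) */
void matvec_serial(size_t n, size_t m,
                   const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < m; j++)
            sum += A[i * m + j] * x[j];
        y[i] = sum;
    }
}

/* OpenMP sketch: each row of the result is independent of the others,
 * so the outer loop over rows is divided among the threads.
 * (Hypothetical helper, not taken from the paper.) */
void matvec_omp(size_t n, size_t m,
                const double *A, const double *x, double *y)
{
    long rows = (long)n;  /* signed loop index for older OpenMP compilers */

    #pragma omp parallel for schedule(static)
    for (long i = 0; i < rows; i++) {
        double sum = 0.0;  /* private to the iteration, hence to the thread */
        for (size_t j = 0; j < m; j++)
            sum += A[(size_t)i * m + j] * x[j];
        y[i] = sum;
    }
}
```

With GCC the pragma takes effect when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and `matvec_omp` simply runs serially. Speedup in the sense used above would then be the serial run time divided by the parallel run time for the same input.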


Published In

BCI '12: Proceedings of the Fifth Balkan Conference in Informatics
September 2012
312 pages
ISBN:9781450312400
DOI:10.1145/2371316

Sponsors

  • MSTD: Ministry of Education, Science and Technological Development - Serbia
  • Novi Sad: Faculty of Technical Sciences, University of Novi Sad

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. linear algebra
  2. matrix computations
  3. multi-core
  4. parallel programming

Qualifiers

  • Research-article

Conference

BCI '12: Balkan Conference in Informatics, 2012
September 16 - 20, 2012
Novi Sad, Serbia

Acceptance Rates

Overall Acceptance Rate 97 of 250 submissions, 39%
