research-article

Performance study of matrix computations using multi-core programming tools

Authors:

Panagiotis D. Michailidis,

Konstantinos G. Margaritis

BCI '12: Proceedings of the Fifth Balkan Conference in Informatics

Pages 186 - 192

https://doi.org/10.1145/2371316.2371353

Published: 16 September 2012

Abstract

Basic matrix computations such as vector and matrix addition, dot product, outer product, matrix transpose, matrix-vector product and matrix multiplication are challenging computational kernels that arise in scientific computing. In this paper, we parallelize these basic matrix computations using multi-core and parallel programming tools, specifically Pthreads, OpenMP, Intel Cilk++, Intel TBB, Intel ArBB, SMPSs, SWARM and FastFlow. The purpose of this paper is to present a unified quantitative and qualitative study of these tools for parallel matrix computations on multi-core platforms. Based on the performance results obtained with compilation optimization, we conclude that Intel ArBB and SWARM are the most appropriate tools because they combine good performance with simplicity of programming. In particular, Intel ArBB is a good choice for implementing compute-intensive operations such as the matrix product because it gives significant speedups over the serial implementation. SWARM, on the other hand, gives good performance for matrix operations of medium size such as vector addition, matrix addition, outer product and matrix-vector product.
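
As a rough illustration of what parallelizing one of these kernels involves, the sketch below splits a matrix-vector product into contiguous row blocks and computes each block in a separate task. It is written in Python with `concurrent.futures`, which is not one of the tools studied in the paper, and the row-block decomposition and the names `matvec_serial` and `matvec_parallel` are assumptions made for this example rather than the authors' implementation. In CPython the global interpreter lock prevents a real speedup for pure-Python arithmetic, so the code shows only the decomposition pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def matvec_serial(a, x):
    """Serial matrix-vector product y = A x for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def matvec_parallel(a, x, workers=4):
    """Row-block parallel y = A x (hypothetical helper for illustration).

    Each task computes a contiguous block of rows; no two tasks write
    the same output element, so no locking is needed.
    """
    n = len(a)
    if n == 0:
        return []
    workers = max(1, min(workers, n))
    block = (n + workers - 1) // workers  # ceiling division
    blocks = [a[i:i + block] for i in range(0, n, block)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the partial
        # results concatenate directly into y.
        parts = pool.map(lambda rows: matvec_serial(rows, x), blocks)
        return [y_i for part in parts for y_i in part]


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    x = [1.0, -1.0]
    print(matvec_parallel(a, x, workers=2))  # prints [-1.0, -1.0, -1.0]
```

The same independent-rows structure is what a loop-level construct such as an OpenMP `parallel for` over the row index would exploit in C; kernels with a reduction, such as the dot product, additionally need the partial sums to be combined.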

Published In

BCI '12: Proceedings of the Fifth Balkan Conference in Informatics

September 2012

312 pages

ISBN:9781450312400

DOI:10.1145/2371316

General Chair: Mirjana Ivanović, Univ. of Novi Sad, Serbia

Program Chair: Zoran Budimac, Univ. of Novi Sad, Serbia

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

MSTD: Ministry of Education, Science and Technological Development - Serbia
Novi Sad: Faculty of Technical Sciences, University of Novi Sad

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

BCI '12

Sponsor:

MSTD
Novi Sad

BCI '12: Balkan Conference in Informatics, 2012

September 16 - 20, 2012

Novi Sad, Serbia

Acceptance Rates

Overall Acceptance Rate 97 of 250 submissions, 39%
