Keyword: ScaLAPACK : Search

research-article

Performance improvement of the triangular matrix product in commodity clusters

The Journal of Supercomputing (JSCO), Volume 80, Issue 11, Pages 16630–16653, https://doi.org/10.1007/s11227-024-06097-7

Abstract

There are many works devoted to improving the matrix product computation, as it is used in a wide variety of scientific applications arising from many different fields. In this work, we propose alternative data distribution policies and ...

research-article

Open Access

Energy consumption comparison of parallel linear systems solver algorithms on HPC infrastructure

SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Pages 1839–1848, https://doi.org/10.1145/3624062.3624266

High-Performance Computing (HPC) systems today are gradually increasing in size and complexity due to the correspondent demand for ever-increasing computing needs, requiring more complicated tasks and higher accuracy. The growing energy needs of HPC ...

Article

Automatic Code Selection for the Dense Symmetric Generalized Eigenvalue Problem Using ATMathCoreLib

Parallel Processing and Applied Mathematics, Pages 453–463, https://doi.org/10.1007/978-3-031-30442-2_34

Abstract

Solution of the symmetric definite generalized eigenvalue problem (GEP) $A x = \lambda B x$ lies at the heart of many scientific computations like electronic structure calculations. The standard algorithm for this problem consists of two parts, namely, ...

research-article

Improving blocked matrix-matrix multiplication routine by utilizing AVX-512 instructions on Intel Knights Landing and Xeon Scalable processors

Cluster Computing (KLU-CLUS), Volume 26, Issue 5, Pages 2539–2549, https://doi.org/10.1007/s10586-021-03274-8

Abstract

In high-performance computing, the general matrix-matrix multiplication (xGEMM) routine is the core of the Level 3 BLAS kernel for effective matrix-matrix multiplication operations. The performance of parallel xGEMM (PxGEMM) is significantly ...

research-article

A high performance implementation of Zolo-SVD algorithm on distributed memory systems

Parallel Computing (PACO), Volume 86, Issue C, Pages 57–65, https://doi.org/10.1016/j.parco.2019.04.004

Abstract

This paper introduces a high performance implementation of the Zolo-SVD algorithm on distributed memory systems, which is based on the polar decomposition (PD) algorithm via the Zolotarev’s function (Zolo-PD), originally proposed by ...

research-article

An efficient hybrid tridiagonal divide-and-conquer algorithm on distributed memory architectures

Journal of Computational and Applied Mathematics (JCAM), Volume 344, Issue C, Pages 512–520, https://doi.org/10.1016/j.cam.2018.05.051

Abstract

In this paper, we propose an efficient divide-and-conquer (DC) algorithm for symmetric tridiagonal matrices based on ScaLAPACK and the hierarchically semiseparable (HSS) matrices. HSS is an important type of rank-structured matrices. ...

research-article

Parallel reduction to Hessenberg form with algorithm-based fault tolerance

SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Article No.: 88, Pages 1–11, https://doi.org/10.1145/2503210.2503249

This paper studies the resilience of a two-sided factorization and presents a generic algorithm-based approach capable of making two-sided factorizations resilient. We establish the theoretical proof of the correctness and the numerical stability of the ...

Article

Tight Coupling of R and Distributed Linear Algebra for High-Level Programming with Big Data

SCC '12: Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, Pages 811–815, https://doi.org/10.1109/SC.Companion.2012.113

We present a new distributed programming extension of the R programming language. By tightly coupling R to the well-known ScaLAPACK and MPI libraries, we are able to achieve highly scalable implementations of common statistical methods, allowing the ...

article

Parallel, 'large', dense matrix problems: Application to 3D sequential integrated inversion of seismological and gravity data

Computers & Geosciences (CGEO), Volume 48, Pages 143–156, https://doi.org/10.1016/j.cageo.2012.05.026

To obtain accurate and reliable estimations of the major lithological properties of the rock within a studied volume, geophysics uses the joint information provided by different geophysical datasets (e.g. gravimetric, magnetic, seismic). Representation ...

Article

Energy Efficient Parallel Matrix-Matrix Multiplication for DVFS-enabled Clusters

ICPPW '12: Proceedings of the 2012 41st International Conference on Parallel Processing Workshops, Pages 239–245, https://doi.org/10.1109/ICPPW.2012.36

Excessive energy consumption has become one of the major challenges in high performance computing. Reducing the energy consumption of frequently used high performance computing applications not only saves the energy cost but also reduces the greenhouse ...

Article

Design and Performance Issues of Cholesky and LU Solvers Using UPCBLAS

ISPA '12: Proceedings of the 2012 IEEE 10th International Symposium on Parallel and Distributed Processing with Applications, Pages 40–47, https://doi.org/10.1109/ISPA.2012.14

Partitioned Global Address Space (PGAS) languages offer programmers a shared memory view that increases their productivity and allow locality exploitation to obtain good performance on current large-scale distributed memory systems. UPCBLAS is a ...

Article

Incomplete cyclic reduction of banded and strictly diagonally dominant linear systems

PPAM'11: Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part I, Pages 80–91, https://doi.org/10.1007/978-3-642-31464-3_9

The ScaLAPACK library contains a pair of routines for solving banded linear systems which are strictly diagonally dominant by rows. Mathematically, the algorithm is complete block cyclic reduction corresponding to a particular block partitioning of the ...

Article

Parallel solution of narrow banded diagonally dominant linear systems

PARA'10: Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2, Pages 280–290, https://doi.org/10.1007/978-3-642-28145-7_28

ScaLAPACK contains a pair of routines for solving systems which are narrow banded and diagonally dominant by rows. Mathematically, the algorithm is block cyclic reduction. The ScaLAPACK implementation can be improved using incomplete, rather than ...

research-article

ScaLAPACK's MRRR algorithm

Christof Vömel

ACM Transactions on Mathematical Software (TOMS), Volume 37, Issue 1, Article No.: 1, Pages 1–35, https://doi.org/10.1145/1644001.1644002

The (sequential) algorithm of Multiple Relatively Robust Representations, MRRR, is a more efficient variant of inverse iteration that does not require reorthogonalization. It solves the eigenproblem of an unreduced symmetric tridiagonal matrix T ∈ R^{n × ...}

article

Interfaces for parallel numerical linear algebra libraries in high level languages

Advances in Engineering Software (ADES), Volume 40, Issue 8, Pages 652–658, https://doi.org/10.1016/j.advengsoft.2008.11.014

In many high performance engineering and scientific applications there is a need to use parallel software libraries. Researchers behind these applications find it difficult to understand the interfaces to these libraries because they carry arguments ...

Article

Parallel Algorithms for Triangular Periodic Sylvester-Type Matrix Equations

Euro-Par '08: Proceedings of the 14th international Euro-Par conference on Parallel Processing, Pages 780–789, https://doi.org/10.1007/978-3-540-85451-7_83

We present parallel algorithms for triangular periodic Sylvester-type matrix equations, conceptually being the third step of a periodic Bartels-Stewart-like solution method for general periodic Sylvester-type matrix equations based on variants of the ...

Article

Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters

ISPDC '08: Proceedings of the 2008 International Symposium on Parallel and Distributed Computing, Pages 73–80, https://doi.org/10.1109/ISPDC.2008.9

This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the ...

Article

Scalable Dense Factorizations for Heterogeneous Computational Clusters

ISPDC '08: Proceedings of the 2008 International Symposium on Parallel and Distributed Computing, Pages 49–56, https://doi.org/10.1109/ISPDC.2008.10

This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These routines are used in the factorization and solution of a dense system of ...

opinion

Biographies

IEEE Annals of the History of Computing (ANHC), Volume 30, Issue 2, Pages 74–81, https://doi.org/10.1109/MAHC.2008.17

Jack Dongarra, a leader of the high-performance computing community, is cocreator of mathematical software packages including EISPACK, LINPACK, LAPACK, and ScaLAPACK. He is also a University Distinguished Professor at the University of Tennessee.

poster

Block size selection of parallel LU and QR on PVP-based and RISC-based supercomputers

CHINA HPC '07: Proceedings of the 2007 Asian technology information program's (ATIP's) 3rd workshop on High performance computing in China: solution approaches to impediments for high performance computing, Pages 115–125, https://doi.org/10.1145/1375783.1375809

In this paper, we proposed a unified framework and tried to address the optimal block size selection problem for parallel blocked LU and QR factorization algorithm used in ScaLAPACK package, since they use two dimensional block cyclic data distribution ...
