Buluç et al., 2009 - Google Patents

Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks

Buluç et al., 2009

Document ID: 11040762343594802863
Author: Buluç A; Fineman J; Frigo M; Gilbert J; Leiserson C
Publication year: 2009
Publication venue: Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures

External Links

Cited by

Snippet

This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and A, x to be computed efficiently in parallel, where A is an n× n sparse matrix with nnz en nonzeros and x is a dense n-vector. Our algorithms use Θ …

Continue reading at crd.lbl.gov (PDF) (other versions)

239000011159 matrix material 0 abstract description 93

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Programme synchronisation; Mutual exclusion, e.g. by means of semaphores; Contention for resources among tasks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application

Similar Documents

Publication	Publication Date	Title
Buluç et al.	2009	Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks
Hou et al.	2017	Fast segmented sort on gpus
Demmel et al.	2012	Communication-optimal parallel and sequential QR and LU factorizations
Boman et al.	2012	The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing: Partitioning, ordering and coloring
Buluç et al.	2011	Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication
Kane et al.	2013	Scalable strategies for computing with massive data
Boman et al.	2013	Scalable matrix computations on large scale-free graphs using 2D graph partitioning
Solomonik et al.	2013	Cyclops tensor framework: Reducing communication and eliminating load imbalance in massively parallel contractions
Merrill et al.	2012	Scalable GPU graph traversal
Yzelman et al.	2013	High-level strategies for parallel shared-memory sparse matrix-vector multiplication
Yang et al.	2013	A peta-scalable CPU-GPU algorithm for global atmospheric simulations
Sarıyüce et al.	2015	Regularizing graph centrality computations
Satish et al.	2012	Large-scale energy-efficient graph traversal: a path to efficient data-intensive supercomputing
Barnard	1995	PMRSB: Parallel multilevel recursive spectral bisection
Yeralan et al.	2017	Algorithm 980: Sparse QR factorization on the GPU
Elafrou et al.	2018	SparseX: A library for high-performance sparse matrix-vector multiplication on multicore platforms
Basaran et al.	2013	Grex: An efficient MapReduce framework for graphics processing units
Lazzaro et al.	2017	Increasing the efficiency of sparse matrix-matrix multiplication with a 2.5 D algorithm and one-sided MPI
Nechma et al.	2014	Parallel sparse matrix solution for circuit simulation on FPGAs
Lange et al.	2013	Achieving efficient strong scaling with PETSc using hybrid MPI/OpenMP optimisation
Blanco et al.	2018	Cstf: Large-scale sparse tensor factorizations on distributed platforms
Augonnet et al.	2019	A hierarchical fast direct solver for distributed memory machines with manycore nodes
Daley et al.	2012	Optimization of multigrid based elliptic solver for large scale simulations in the FLASH code
Kim et al.	2022	SnuHPL: high performance LINPACK for heterogeneous GPUs
Lu et al.	2023	Tilesptrsv: a tiled algorithm for parallel sparse triangular solve on gpus