Buluç et al., 2009 - Google Patents
Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocksBuluç et al., 2009
View PDF- Document ID
- 11040762343594802863
- Author
- Buluç A
- Fineman J
- Frigo M
- Gilbert J
- Leiserson C
- Publication year
- Publication venue
- Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
External Links
Snippet
This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and A, x to be computed efficiently in parallel, where A is an n× n sparse matrix with nnz en nonzeros and x is a dense n-vector. Our algorithms use Θ …
- 239000011159 matrix material 0 abstract description 93
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Programme synchronisation; Mutual exclusion, e.g. by means of semaphores; Contention for resources among tasks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Buluç et al. | Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks | |
Hou et al. | Fast segmented sort on gpus | |
Demmel et al. | Communication-optimal parallel and sequential QR and LU factorizations | |
Boman et al. | The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing: Partitioning, ordering and coloring | |
Buluç et al. | Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication | |
Kane et al. | Scalable strategies for computing with massive data | |
Boman et al. | Scalable matrix computations on large scale-free graphs using 2D graph partitioning | |
Solomonik et al. | Cyclops tensor framework: Reducing communication and eliminating load imbalance in massively parallel contractions | |
Merrill et al. | Scalable GPU graph traversal | |
Satish et al. | Navigating the maze of graph analytics frameworks using massive graph datasets | |
Yang et al. | A peta-scalable CPU-GPU algorithm for global atmospheric simulations | |
Yzelman et al. | High-level strategies for parallel shared-memory sparse matrix-vector multiplication | |
Sarıyüce et al. | Regularizing graph centrality computations | |
Barnard | PMRSB: Parallel multilevel recursive spectral bisection | |
Ballard et al. | Communication-optimal parallel and sequential Cholesky decomposition | |
Yeralan et al. | Algorithm 980: Sparse QR factorization on the GPU | |
Elafrou et al. | SparseX: A library for high-performance sparse matrix-vector multiplication on multicore platforms | |
Basaran et al. | Grex: An efficient MapReduce framework for graphics processing units | |
Lazzaro et al. | Increasing the efficiency of sparse matrix-matrix multiplication with a 2.5 D algorithm and one-sided MPI | |
Nechma et al. | Parallel sparse matrix solution for circuit simulation on FPGAs | |
Lange et al. | Achieving efficient strong scaling with PETSc using hybrid MPI/OpenMP optimisation | |
Blanco et al. | Cstf: Large-scale sparse tensor factorizations on distributed platforms | |
Fialko | Parallel direct solver for solving systems of linear equations resulting from finite element method on multi-core desktops and workstations | |
Augonnet et al. | A hierarchical fast direct solver for distributed memory machines with manycore nodes | |
Daley et al. | Optimization of multigrid based elliptic solver for large scale simulations in the FLASH code |