[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Buluç et al., 2009 - Google Patents

Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks

Buluç et al., 2009

View PDF
Document ID
11040762343594802863
Author
Buluç A
Fineman J
Frigo M
Gilbert J
Leiserson C
Publication year
Publication venue
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures

External Links

Snippet

This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and A, x to be computed efficiently in parallel, where A is an n× n sparse matrix with nnz en nonzeros and x is a dense n-vector. Our algorithms use Θ …
Continue reading at crd.lbl.gov (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G06F17/30386Retrieval requests
    • G06F17/30424Query processing
    • G06F17/30442Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46Multiprogramming arrangements
    • G06F9/52Programme synchronisation; Mutual exclusion, e.g. by means of semaphores; Contention for resources among tasks
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30943Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
    • G06F17/30946Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5009Computer-aided design using simulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformations of program code
    • G06F8/41Compilation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application

Similar Documents

Publication Publication Date Title
Buluç et al. Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks
Hou et al. Fast segmented sort on gpus
Demmel et al. Communication-optimal parallel and sequential QR and LU factorizations
Boman et al. The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing: Partitioning, ordering and coloring
Buluç et al. Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication
Kane et al. Scalable strategies for computing with massive data
Boman et al. Scalable matrix computations on large scale-free graphs using 2D graph partitioning
Solomonik et al. Cyclops tensor framework: Reducing communication and eliminating load imbalance in massively parallel contractions
Merrill et al. Scalable GPU graph traversal
Satish et al. Navigating the maze of graph analytics frameworks using massive graph datasets
Yang et al. A peta-scalable CPU-GPU algorithm for global atmospheric simulations
Yzelman et al. High-level strategies for parallel shared-memory sparse matrix-vector multiplication
Sarıyüce et al. Regularizing graph centrality computations
Barnard PMRSB: Parallel multilevel recursive spectral bisection
Ballard et al. Communication-optimal parallel and sequential Cholesky decomposition
Yeralan et al. Algorithm 980: Sparse QR factorization on the GPU
Elafrou et al. SparseX: A library for high-performance sparse matrix-vector multiplication on multicore platforms
Basaran et al. Grex: An efficient MapReduce framework for graphics processing units
Lazzaro et al. Increasing the efficiency of sparse matrix-matrix multiplication with a 2.5 D algorithm and one-sided MPI
Nechma et al. Parallel sparse matrix solution for circuit simulation on FPGAs
Lange et al. Achieving efficient strong scaling with PETSc using hybrid MPI/OpenMP optimisation
Blanco et al. Cstf: Large-scale sparse tensor factorizations on distributed platforms
Fialko Parallel direct solver for solving systems of linear equations resulting from finite element method on multi-core desktops and workstations
Augonnet et al. A hierarchical fast direct solver for distributed memory machines with manycore nodes
Daley et al. Optimization of multigrid based elliptic solver for large scale simulations in the FLASH code