Huang et al., 2024 - Google Patents

An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression

Document ID
323249764786909653
Author
Huang J
Di S
Yu X
Zhai Y
Zhang Z
Liu J
Lu X
Raffenetti K
Zhou H
Zhao K
Chen Z
Cappello F
Guo Y
Thakur R
Publication year
2024
Publication venue
2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Snippet

With the ever-increasing computing power of supercomputers and the growing scale of scientific applications, the efficiency of MPI collective communications turns out to be a critical bottleneck in large-scale distributed and parallel processing. The large message size …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17306 Intercommunication techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogramme communication; Intertask communication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067 File systems; File servers
    • G06F17/30129 Details of further file system functionalities
    • G06F17/3015 Redundancy elimination performed by the file system
    • G06F17/30153 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled

Similar Documents

Publication Publication Date Title
Moreland et al. An image compositing solution at scale
Di et al. Fast error-bounded lossy HPC data compression with SZ
Ma et al. Garaph: Efficient GPU-accelerated graph processing on a single machine with balanced replication
Zou et al. FlexAnalytics: a flexible data analytics framework for big data applications with I/O performance improvement
US11436065B2 (en) System for efficient large-scale data distribution in distributed and parallel processing environment
Zhou et al. Designing high-performance MPI libraries with on-the-fly compression for modern GPU clusters
Knecht et al. Large-scale parallel configuration interaction. II. Two- and four-component double-group general active space implementation with application to BiH
Peterka et al. A configurable algorithm for parallel image-compositing applications
Huang et al. An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression
Yu et al. Ultrafast error-bounded lossy compression for scientific datasets
JPWO2014061481A1 (en) Data transfer apparatus and data transfer system using adaptive compression algorithm
Cheng et al. HAFLO: GPU-based acceleration for federated logistic regression
Al Sideiri et al. CUDA implementation of fractal image compression
US20190281316A1 (en) High efficiency video coding method and apparatus, and computer-readable storage medium
CN117435855A (en) Method for performing convolution operation, electronic device, and storage medium
Barrett et al. Reducing the bulk in the bulk synchronous parallel model
Zhou et al. Accelerating broadcast communication with GPU compression for deep learning workloads
Wu et al. Memory-efficient quantum circuit simulation by using lossy data compression
Xu et al. Scaling up data-parallel analytics platforms: Linear algebraic operation cases
US10210136B2 (en) Parallel computer and FFT operation method
Huang et al. POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters
Markov et al. CGX: adaptive system support for communication-efficient deep learning
Suresh et al. Network assisted non-contiguous transfers for GPU-aware MPI libraries
KR20220142059A (en) In-memory Decoding Cache and Its Management Scheme for Accelerating Deep Learning Batching Process
Koyama et al. Scalable data parallel distributed training for graph neural networks