Small et al., 2010 - Google Patents

Near-optimal Rendezvous protocols for RDMA-enabled clusters

Document ID
10334287003007040013
Author
Small M
Gu Z
Yuan X
Publication year
2010
Publication venue
2010 39th International Conference on Parallel Processing

Snippet

Optimizing Message Passing Interface (MPI) point-to-point communication for large messages is of paramount importance since most communications in MPI applications are performed by such operations. Remote Direct Memory Access (RDMA) allows one-sided …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Programme initiating; Programme switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356 Indirect interconnection networks
    • G06F15/17368 Indirect interconnection networks non hierarchical topologies
    • G06F15/17381 Two dimensional, e.g. mesh, torus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17337 Direct connection machines, e.g. completely connected computers, point to point communication networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogramme communication; Intertask communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30 Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G06F15/78 Architectures of general purpose stored programme computers comprising a single central processing unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00 Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformations of program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38 Indexing scheme relating to groups G06F7/38 - G06F7/575
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00 Packet switching elements

Similar Documents

Publication Publication Date Title
Kumar et al. The deep computing messaging framework: generalized scalable message passing on the Blue Gene/P supercomputer
Shah et al. Performance and experience with LAPI-a new high-performance communication library for the IBM RS/6000 SP
Ajima et al. Tofu interconnect 2: System-on-chip integration of high-performance interconnect
Pakin et al. Fast Messages: Efficient, portable communication for workstation clusters and MPPs
US8032892B2 (en) Message passing with a limited number of DMA byte counters
US8082424B2 (en) Determining when a set of compute nodes participating in a barrier operation on a parallel computer are ready to exit the barrier operation
US7788334B2 (en) Multiple node remote messaging
US8325633B2 (en) Remote direct memory access
Araki et al. User-space communication: A quantitative study
US7802025B2 (en) DMA engine for repeating communication patterns
US20130067206A1 (en) Endpoint-Based Parallel Data Processing In A Parallel Active Messaging Interface Of A Parallel Computer
Suresh et al. A novel framework for efficient offloading of communication operations to bluefield smartnics
Small et al. Near-optimal Rendezvous protocols for RDMA-enabled clusters
Shoemaker et al. Numesh: An architecture optimized for scheduled communication
Suresh et al. Network assisted non-contiguous transfers for GPU-aware MPI libraries
US8782164B2 (en) Implementing asyncronous collective operations in a multi-node processing system
Rashti et al. A speculative and adaptive MPI rendezvous protocol over RDMA-enabled interconnects
Wong et al. Push-Pull Messaging: a high-performance communication mechanism for commodity SMP clusters
Schneider et al. Kernel-based offload of collective operations–implementation, evaluation and lessons learned
Kee et al. An efficient implementation of the BSP programming library for VIA
Nunes et al. A profiler for a heterogeneous multi-core multi-FPGA system
Gu et al. Protocol customization for improving MPI performance on RDMA-enabled clusters
Roweth et al. Optimised global reduction on QsNet II
Mohamed et al. High-performance message striping over reliable transport protocols
Peryshkova et al. Analysis of All-to-all Collective Operations on Hierarchical Computer Clusters