Integration of message passing and shared memory in the Stanford FLASH multiprocessor

Published: 01 November 1994 Publication History

Abstract

The advantages of using message passing over shared memory for certain types of communication and synchronization have provided an incentive to integrate both models within a single architecture. A key goal of the FLASH (FLexible Architecture for SHared memory) project at Stanford is to achieve this integration while maintaining a simple and efficient design. This paper presents the hardware and software mechanisms in FLASH to support various message-passing protocols. We achieve low-overhead message passing by delegating protocol functionality to the programmable node controllers in FLASH and by providing direct user-level access to this messaging subsystem. In contrast to most earlier work, we provide an integrated solution that handles the interaction of the messaging protocols with virtual memory, protected multiprogramming, and cache coherence. Detailed simulation studies indicate that this system can sustain message-transfer rates of several hundred megabytes per second, effectively utilizing projected network bandwidths for next-generation multiprocessors.

Published In

ACM SIGOPS Operating Systems Review, Volume 28, Issue 5
Dec. 1994
323 pages
ISSN:0163-5980
DOI:10.1145/381792
  • ASPLOS VI: Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
    November 1994
    341 pages
    ISBN:0897916603
    DOI:10.1145/195473
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
