The Effects of Latency, Occupancy, and Bandwidth in Distributed Shared Memory Multiprocessors
January 1995
1995 Technical Report
Publisher: Stanford University, 408 Panama Mall, Suite 217, Stanford, CA, United States
Published: 01 January 1995
Abstract

Distributed shared memory (DSM) machines can be characterized by four parameters, based on a slightly modified version of the LogP model. The l (latency) and o (occupancy of the communication controller) parameters are the keys to performance in these machines, and are largely determined by major architectural decisions about the aggressiveness and customization of the node and network. For recent and upcoming machines, the g (gap) parameter, which measures node-to-network bandwidth, does not appear to be a bottleneck. Conventional wisdom is that latency is the dominant factor in determining the performance of a DSM machine. We show, however, that controller occupancy, which causes contention even in highly optimized applications, plays a major role, especially at low latencies. When latency hiding is used, occupancy becomes more critical, even in machines with high-latency networks. Scaling the problem size is often used as a technique to overcome limitations in communication latency and bandwidth. We show that in many structured computations occupancy-induced contention is not alleviated by increasing problem size, and that there are important classes of applications for which the performance lost by using higher-latency networks or higher-occupancy controllers cannot be regained easily, if at all, by scaling the problem size.
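
As a rough illustration of why occupancy can matter most when latency is low, the sketch below models a single communication controller as a FIFO server. It is a toy model under stated assumptions, not the simulation methodology of the report: the function name `round_trip_times`, the single-controller setup, and the numeric values are all hypothetical. Each request spends l cycles in the network, holds the controller for o cycles, and spends l cycles returning; requests that arrive while the controller is busy must wait.

```python
def round_trip_times(issue_times, l, o):
    """Toy model (hypothetical, for illustration only).

    issue_times: cycles at which requests are issued to one remote controller
    l: one-way network latency in cycles
    o: controller occupancy per request in cycles
    Returns the round-trip time of each request, in issue order.
    """
    free_at = 0  # cycle at which the controller next becomes idle
    times = []
    for issued in sorted(issue_times):
        arrives = issued + l
        start = max(arrives, free_at)  # queue behind earlier requests
        free_at = start + o            # controller is occupied for o cycles
        times.append(free_at + l - issued)
    return times


# Four simultaneous requests, same occupancy, two different latencies.
low = round_trip_times([0, 0, 0, 0], l=10, o=5)
high = round_trip_times([0, 0, 0, 0], l=100, o=5)
print(low)   # [25, 30, 35, 40]
print(high)  # [205, 210, 215, 220]
```

In this toy setting the last request waits 15 cycles behind the others in both cases, but that delay is a much larger share of a 40-cycle round trip than of a 220-cycle one, which is consistent with the abstract's point that occupancy-induced contention is most visible at low latencies.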

Cited By

  1. Zhang Z and Seidel S A performance model for fine-grain accesses in UPC Proceedings of the 20th international conference on Parallel and distributed processing, (65-65)
  2. Falsafi B and Wood D (2005). Evaluating scheduling policies for fine-grain communication protocols on a cluster of SMPs, Journal of Parallel and Distributed Computing, 65:4, (464-478), Online publication date: 1-Apr-2005.
  3. Chaudhuri M, Heinrich M, Holt C, Singh J, Rothberg E and Hennessy J (2003). Latency, Occupancy, and Bandwidth in DSM Multiprocessors, IEEE Transactions on Computers, 52:7, (862-880), Online publication date: 1-Jul-2003.
  4. Hsiao H and King C (2001). An Application-Driven Study of Multicast Communication for Write Invalidation, The Journal of Supercomputing, 18:3, (279-304), Online publication date: 1-Mar-2001.
  5. Moritz C and Frank M (2001). LoGPC, IEEE Transactions on Parallel and Distributed Systems, 12:4, (404-415), Online publication date: 1-Apr-2001.
  6. Hoisie A, Lubeck O and Wasserman H (2000). Performance and Scalability Analysis of Teraflop-Scale Parallel Architectures Using Multidimensional Wavefront Applications, International Journal of High Performance Computing Applications, 14:4, (330-346), Online publication date: 1-Nov-2000.
  7. Heinrich M, Soundararajan V, Hennessy J and Gupta A (1999). A Quantitative Analysis of the Performance and Scalability of Distributed Shared Memory Cache Coherence Protocols, IEEE Transactions on Computers, 48:2, (205-217), Online publication date: 1-Feb-1999.
  8. Michael M, Nanda A and Lim B (1999). Coherence Controller Architectures for Scalable Shared-Memory Multiprocessors, IEEE Transactions on Computers, 48:2, (245-255), Online publication date: 1-Feb-1999.
  9. Hwang K, Wang C, Wang C and Xu Z (1999). Resource Scaling Effects on MPP Performance, IEEE Transactions on Parallel and Distributed Systems, 10:5, (509-527), Online publication date: 1-May-1999.
  10. Sundaram-Stukel D and Vernon M Predictive analysis of a wavefront application using LogGP Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, (141-150)
  11. Sundaram-Stukel D and Vernon M (1999). Predictive analysis of a wavefront application using LogGP, ACM SIGPLAN Notices, 34:8, (141-150), Online publication date: 1-Aug-1999.
  12. Bilas A, Iftode L and Singh J Evaluation of hardware write propagation support for next-generation shared virtual memory clusters Proceedings of the 12th international conference on Supercomputing, (274-281)
  13. Moritz C and Frank M LoGPC Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, (254-263)
  14. Moritz C and Frank M (1998). LoGPC, ACM SIGMETRICS Performance Evaluation Review, 26:1, (254-263), Online publication date: 1-Jun-1998.
  15. Qin X and Baer J Optimizing software cache-coherent cluster architectures Proceedings of the 1998 ACM/IEEE conference on Supercomputing, (1-14)
  16. Qin X and Baer J A performance evaluation of cluster architectures Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, (237-247)
  17. Qin X and Baer J (1997). A performance evaluation of cluster architectures, ACM SIGMETRICS Performance Evaluation Review, 25:1, (237-247), Online publication date: 1-Jun-1997.
  18. Frank M, Agarwal A and Vernon M LoPC Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming, (276-287)
  19. Frank M, Agarwal A and Vernon M (1997). LoPC, ACM SIGPLAN Notices, 32:7, (276-287), Online publication date: 1-Jul-1997.
  20. Martin R, Vahdat A, Culler D and Anderson T Effects of communication latency, overhead, and bandwidth in a cluster architecture Proceedings of the 24th annual international symposium on Computer architecture, (85-97)
  21. Michael M, Nanda A, Lim B and Scott M Coherence controller architectures for SMP-based CC-NUMA multiprocessors Proceedings of the 24th annual international symposium on Computer architecture, (219-228)
  22. Martin R, Vahdat A, Culler D and Anderson T (1997). Effects of communication latency, overhead, and bandwidth in a cluster architecture, ACM SIGARCH Computer Architecture News, 25:2, (85-97), Online publication date: 1-May-1997.
  23. Michael M, Nanda A, Lim B and Scott M (1997). Coherence controller architectures for SMP-based CC-NUMA multiprocessors, ACM SIGARCH Computer Architecture News, 25:2, (219-228), Online publication date: 1-May-1997.
  24. Bilas A and Singh J The effects of communication parameters on end performance of shared virtual memory clusters Proceedings of the 1997 ACM/IEEE conference on Supercomputing, (1-35)
  25. Moga A, Dubois M and Gefflaut A Hardware Versus Software Implementation of COMA Proceedings of the international Conference on Parallel Processing, (248-256)
  26. Iftode L, Singh J and Li K Understanding application performance on shared virtual memory systems Proceedings of the 23rd annual international symposium on Computer architecture, (122-133)
  27. Holt C, Singh J and Hennessy J Application and architectural bottlenecks in large scale distributed shared memory machines Proceedings of the 23rd annual international symposium on Computer architecture, (134-145)
  28. Iftode L, Singh J and Li K (1996). Understanding application performance on shared virtual memory systems, ACM SIGARCH Computer Architecture News, 24:2, (122-133), Online publication date: 1-May-1996.
  29. Holt C, Singh J and Hennessy J (1996). Application and architectural bottlenecks in large scale distributed shared memory machines, ACM SIGARCH Computer Architecture News, 24:2, (134-145), Online publication date: 1-May-1996.
  30. Woo S, Ohara M, Torrie E, Singh J and Gupta A The SPLASH-2 programs Proceedings of the 22nd annual international symposium on Computer architecture, (24-36)
  31. Woo S, Ohara M, Torrie E, Singh J and Gupta A (1995). The SPLASH-2 programs, ACM SIGARCH Computer Architecture News, 23:2, (24-36), Online publication date: 1-May-1995.
Contributors
  • Stanford University
  • University of Central Florida
  • Princeton University
  • Hewlett Packard Enterprise
  • Stanford University
