Network endpoint congestion control for fine-grained communication

Authors:

Larry Dennison,

William J. Dally

SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Article No.: 35, Pages 1 - 12

https://doi.org/10.1145/2807591.2807600

Published: 15 November 2015

Abstract

Endpoint congestion in HPC networks creates tree saturation that is detrimental to performance. Endpoint congestion can be alleviated by reducing the injection rate of traffic sources, but doing so requires a fast reaction time to avoid congestion buildup. Congestion control becomes more challenging as application communication shifts from the traditional two-sided model to potentially fine-grained, one-sided communication embodied by various global address space programming models. Existing hardware solutions, such as Explicit Congestion Notification (ECN) and Speculative Reservation Protocol (SRP), either react too slowly or incur too much overhead for small messages.

In this study we present two new endpoint congestion-control protocols, Small-Message SRP (SMSRP) and Last-Hop Reservation Protocol (LHRP), both targeted specifically for small messages. Experiments show they can quickly respond to endpoint congestion and prevent tree saturation in the network. Under congestion-free traffic conditions, the new protocols generate minimal overhead with performance comparable to networks with no endpoint congestion control.
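The claim that endpoint congestion eases when sources lower their injection rate can be illustrated with a toy fluid model. The sketch below is not SMSRP, LHRP, ECN, or SRP, and it is not taken from the paper; the function name `simulate_hotspot`, its parameters, and the rates used are all assumptions chosen for illustration. It only shows that backlog toward a single endpoint grows without bound while the combined offered load exceeds the endpoint's ejection bandwidth, and stays flat once the sources are throttled to a fair share.

```python
def simulate_hotspot(num_sources, injection_rate, cycles, ejection_rate=1.0):
    """Toy fluid model (hypothetical): backlog queued toward one endpoint.

    Each cycle, every source offers `injection_rate` packets to the same
    endpoint, and the endpoint drains at most `ejection_rate` packets.
    Returns the backlog (in packets) after each cycle.
    """
    backlog = 0.0
    history = []
    for _ in range(cycles):
        backlog += num_sources * injection_rate       # combined offered load
        backlog = max(0.0, backlog - ejection_rate)   # endpoint ejection limit
        history.append(backlog)
    return history


# Eight sources each injecting at half rate oversubscribe the endpoint 4x.
unthrottled = simulate_hotspot(num_sources=8, injection_rate=0.5, cycles=100)

# Throttling every source to a 1/8 share matches the ejection bandwidth.
throttled = simulate_hotspot(num_sources=8, injection_rate=1.0 / 8, cycles=100)

print("backlog after 100 cycles, unthrottled:", unthrottled[-1])
print("backlog after 100 cycles, throttled:  ", throttled[-1])
```

In a real network that backlog does not sit in an unbounded queue: it fills switch buffers back toward the sources, which is the tree saturation the abstract describes. The model also ignores the reaction delay that makes fast throttling hard in practice.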

Index Terms

Network endpoint congestion control for fine-grained communication
1. Networks
  1. Network architectures
  2. Network protocols

Published In

SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

November 2015

985 pages

ISBN:9781450337236

DOI:10.1145/2807591

General Chair: Jackie Kern, University of Illinois at Urbana-Champaign, Urbana, Illinois

Program Chair: Jeffrey S. Vetter, Oak Ridge National Laboratory and Georgia Institute of Technology, Oak Ridge, Tennessee

Copyright © 2015 ACM.

© 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SC15

Sponsor:

SIGHPC
SIGARCH
IEEE-CS

SC15: The International Conference for High Performance Computing, Networking, Storage and Analysis

November 15 - 20, 2015

Austin, Texas

Acceptance Rates

SC '15 Paper Acceptance Rate 79 of 358 submissions, 22%;

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics

Total Citations: 38
Total Downloads: 568
Downloads (Last 12 months): 42
Downloads (Last 6 weeks): 4

Reflects downloads up to 10 Dec 2024

Cited By

Wu K, Dong D, Xu W (2024). COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign. ACM Transactions on Architecture and Code Optimization 21:3, pp. 1-26. Online publication date: 22-Apr-2024.
https://dl.acm.org/doi/10.1145/3660525

Zhuang H, Chang J, Li X, Song F, Lin Q (2023). All-to-All Broadcast Algorithm in Galaxyfly Networks. Mathematics 11:11, article 2459. Online publication date: 26-May-2023.
https://doi.org/10.3390/math11112459

Muthukrishnan H, Lustig D, Villa O, Wenisch T, Nellans D (2023). FinePack: Transparently Improving the Efficiency of Fine-Grained Transfers in Multi-GPU Systems. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 516-529. Online publication date: Feb-2023.
https://doi.org/10.1109/HPCA56546.2023.10070949

Huang S, Dong D, Zeng L, Zhou Z, Zhou Y, Liao X (2022). DC4: Reconstructing Data-Credit-Coupled Congestion Control for Data Centers. Proceedings of the 51st International Conference on Parallel Processing, pp. 1-11. Online publication date: 29-Aug-2022.
https://dl.acm.org/doi/10.1145/3545008.3545023

Li C, Dong D, Liao X (2022). MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers. ACM Transactions on Architecture and Code Optimization 19:3, pp. 1-23. Online publication date: 4-May-2022.
https://dl.acm.org/doi/10.1145/3519027

Snyder J, Lebeck A (2022). Fast Convergence to Fairness for Reduced Long Flow Tail Latency in Datacenter Networks. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 1007-1017. Online publication date: May-2022.
https://doi.org/10.1109/IPDPS53621.2022.00102

Huang S, Dong D, Zhou Z, Shi H, Yang W, Liao X (2022). FastCredit: Expediting credit-based congestion control in datacenters. Computer Networks 214, article 109126. Online publication date: Sep-2022.
https://doi.org/10.1016/j.comnet.2022.109126

Wu K, Dong D, Li C, Xu W (2022). Revisiting network congestion avoidance through adaptive packet-chaining reservation. Computer Networks: The International Journal of Computer and Telecommunications Networking 212:C. Online publication date: 20-Jul-2022.
https://dl.acm.org/doi/10.1016/j.comnet.2022.109008

Zhang Y, Qian K, Ren F (2021). Receiver-Driven Congestion Control for InfiniBand. Proceedings of the 50th International Conference on Parallel Processing, pp. 1-10. Online publication date: 9-Aug-2021.
https://dl.acm.org/doi/10.1145/3472456.3472466

Patke A, Jha S, Qiu H, Brandt J, Gentile A, Greenseid J, Kalbarczyk Z, Iyer R, Zhou H, Moreira J, Mueller F, Etsion Y (2021). Delay sensitivity-driven congestion mitigation for HPC systems. Proceedings of the 35th ACM International Conference on Supercomputing, pp. 342-353. Online publication date: 3-Jun-2021.
https://dl.acm.org/doi/10.1145/3447818.3460362
