Interconnection Networks in Petascale Computer Systems: A Survey

Published: 16 September 2016

Abstract

This article provides background information about interconnection networks, an analysis of previous developments, and an overview of the state of the art. Its main contribution is to highlight the importance of interpolating and extrapolating technological changes and physical constraints in order to predict the optimum future interconnection network. The technological changes relate to three of the most important attributes of interconnection networks: topology, routing, and flow-control algorithms. The physical constraints, that is, port counts, number of communication nodes, and communication speed, determine the realistic properties of the network. We present the state-of-the-art technology for the most commonly used interconnection networks and some background on often-used network topologies. The interconnection networks of the best-performing petascale parallel computers from past and present Top500 lists are analyzed. The lessons learned from this analysis indicate that interconnection networks will need better performance in future exascale computers. The analysis leads to the conclusion that a high-radix topology with optical connections for longer links is set to become the optimum interconnect for a number of relevant application domains.
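
As a rough illustration of how port counts and node counts constrain a topology, the sketch below computes standard textbook figures for a k-ary n-cube (an n-dimensional torus with k nodes per ring). It is a minimal example and is not taken from the article; the function name `torus_properties` and the chosen sizes are assumptions made for illustration only.

```python
def torus_properties(k: int, n: int) -> dict:
    """Basic figures for a k-ary n-cube (n-dimensional torus, k nodes per ring)."""
    if k < 3 or n < 1:
        # For k = 2 the two neighbours in a ring coincide, so the port count differs.
        raise ValueError("this sketch assumes k >= 3 and n >= 1")
    return {
        "nodes": k ** n,           # one node per coordinate tuple
        "router_ports": 2 * n,     # two neighbours per dimension
        "diameter": n * (k // 2),  # worst case: halfway around every ring
    }


# Two hypothetical configurations with the same number of nodes.
for k, n in [(16, 3), (4, 6)]:
    print(f"{k}-ary {n}-cube:", torus_properties(k, n))
```

Both configurations connect 4096 nodes, but the 6D torus uses 12 ports per router and has a diameter of 12 hops, while the 3D torus uses 6 ports and has a diameter of 24 hops. Under these assumptions, a higher port count buys a shorter worst-case path, which is the kind of trade-off behind the abstract's interest in high-radix topologies.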

      Published In

      ACM Computing Surveys  Volume 49, Issue 3
      September 2017
      658 pages
      ISSN:0360-0300
      EISSN:1557-7341
      DOI:10.1145/2988524
      Editor: Sartaj Sahni
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 September 2016
      Accepted: 01 July 2016
      Revised: 01 July 2016
      Received: 01 November 2015
      Published in CSUR Volume 49, Issue 3

      Author Tags

      1. Interconnection networks
      2. Top500 list
      3. exascale computers
      4. high performance parallel computers

      Qualifiers

      • Survey
      • Research
      • Refereed

      Funding Sources

      • Slovenian Research Agency
