DOI: 10.1109/ISCA.2005.34
Article

Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling

Published: 01 May 2005

Abstract

This paper examines the area, power, performance, and design issues for the on-chip interconnects on a chip multiprocessor, attempting to present a comprehensive view of a class of interconnect architectures. It shows that the design choices for the interconnect have a significant effect on the rest of the chip, potentially consuming a significant fraction of the real estate and power budget. This research shows that designs that treat the interconnect as an entity that can be independently architected and optimized would not arrive at the best multi-core design. Several examples are presented showing the need for careful co-design. For instance, increasing interconnect bandwidth requires area that then constrains the number of cores or cache sizes, and does not necessarily increase performance. Also, shared level-2 caches become significantly less attractive when the overhead of the resulting crossbar is accounted for. A hierarchical bus structure is examined which negates some of the performance costs of the assumed baseline architecture.
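
The area trade-off described in the abstract can be illustrated with a toy budget calculation. The sketch below is not the paper's model: the die area, per-core area, cache density, and the linear interconnect cost per unit of bandwidth are all hypothetical values chosen only to show the accounting. Under a fixed die budget, every square millimetre spent on the interconnect is unavailable for cores or cache.

```python
# Toy area accounting for a chip multiprocessor.
# Every constant below is a hypothetical, illustrative value,
# not a figure taken from the paper.
DIE_AREA_MM2 = 400.0          # assumed total die budget
CORE_AREA_MM2 = 25.0          # assumed area of one core with its private L1
CACHE_MM2_PER_MB = 10.0       # assumed L2 area per megabyte
LINK_MM2_PER_UNIT_BW = 15.0   # assumed interconnect area per unit of bandwidth


def l2_capacity_mb(cores: int, bandwidth_units: int) -> float:
    """Return the L2 capacity (MB) that fits after cores and interconnect.

    Assumes interconnect area grows linearly with bandwidth, which is a
    simplification. Raises ValueError if cores plus interconnect alone
    exceed the die budget.
    """
    used = cores * CORE_AREA_MM2 + bandwidth_units * LINK_MM2_PER_UNIT_BW
    left = DIE_AREA_MM2 - used
    if left < 0:
        raise ValueError("cores and interconnect exceed the die budget")
    return left / CACHE_MM2_PER_MB


if __name__ == "__main__":
    # Sweep interconnect bandwidth for a fixed core count.
    for bw in (1, 2, 4, 8):
        print(f"bandwidth x{bw}: {l2_capacity_mb(8, bw):.1f} MB of L2 for 8 cores")
```

With these made-up numbers, raising the bandwidth from 1 to 8 units shrinks the L2 that fits beside 8 cores from 18.5 MB to 8.0 MB. Whether that exchange helps or hurts performance depends on the workload, which is the co-design point the abstract makes.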

Cited By

  • (2023) Machine Learning Enabled Solutions for Design and Optimization Challenges in Networks-on-Chip based Multi/Many-Core Architectures. ACM Journal on Emerging Technologies in Computing Systems 19:3 (1-26). DOI: 10.1145/3591470. Online publication date: 30-Jun-2023
  • (2022) MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers. ACM Transactions on Architecture and Code Optimization 19:3 (1-23). DOI: 10.1145/3519027. Online publication date: 4-May-2022
  • (2022) CryoWire: wire-driven microarchitecture designs for cryogenic computing. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (903-917). DOI: 10.1145/3503222.3507749. Online publication date: 28-Feb-2022

Published In

ISCA '05: Proceedings of the 32nd annual international symposium on Computer Architecture
June 2005
541 pages
ISBN: 076952270X
  • ACM SIGARCH Computer Architecture News, Volume 33, Issue 2
    ISCA 2005
    May 2005
    531 pages
    ISSN: 0163-5964
    DOI: 10.1145/1080695

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article

Conference

ISCA05
Acceptance Rates

ISCA '05 Paper Acceptance Rate: 45 of 194 submissions, 23%
Overall Acceptance Rate: 543 of 3,203 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 3
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 01 Jan 2025

Citations

  • (2022) A router architecture with dual input and dual output channels for Networks-on-Chip. Microprocessors & Microsystems 90:C. DOI: 10.1016/j.micpro.2022.104464. Online publication date: 1-Apr-2022
  • (2017) An Efficient Self-Routing and Non-Blocking Interconnection Network on Chip. Proceedings of the 10th International Workshop on Network on Chip Architectures (1-6). DOI: 10.1145/3139540.3139546. Online publication date: 14-Oct-2017
  • (2016) FuMicro. VLSI Design 2016 (2). DOI: 10.1155/2016/8787919. Online publication date: 1-Dec-2016
  • (2016) Electro-Photonic NoC Designs for Kilocore Systems. ACM Journal on Emerging Technologies in Computing Systems 13:2 (1-25). DOI: 10.1145/2967614. Online publication date: 3-Nov-2016
  • (2016) Revisiting actor programming in C++. Computer Languages, Systems and Structures 45:C (105-131). DOI: 10.1016/j.cl.2016.01.002. Online publication date: 1-Apr-2016
  • (2015) Simple Virtual Channel Allocation for High-Throughput and High-Frequency On-Chip Routers. ACM Transactions on Parallel Computing 2:1 (1-23). DOI: 10.1145/2742349. Online publication date: 21-May-2015
  • (2015) Dynamic Cache Pooling in 3D Multicore Processors. ACM Journal on Emerging Technologies in Computing Systems 12:2 (1-21). DOI: 10.1145/2700247. Online publication date: 2-Sep-2015