Article

Distributed Microarchitectural Protocols in the TRIPS Prototype Processor

Authors: Karthikeyan Sankaralingam, Ramadass Nagarajan, Robert McDonald, Rajagopalan Desikan, Saurabh Drolia, M. S. Govindan, Heather Hanson, Nitya Ranganathan, Simha Sethumadhavan, Premkishore Shivakumar, Stephen W. Keckler, Doug Burger

MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture

Pages 480 - 491

https://doi.org/10.1109/MICRO.2006.19

Published: 09 December 2006

Abstract

Growing on-chip wire delays will cause many future microarchitectures to be distributed, in which hardware resources within a single processor become nodes on one or more switched micronetworks. Since large processor cores will require multiple clock cycles to traverse, control must be distributed, not centralized. This paper describes the control protocols in the TRIPS processor, a distributed, tiled microarchitecture that supports dynamic execution. It details each of the five types of reused tiles that compose the processor, the control and data networks that connect them, and the distributed microarchitectural protocols that implement instruction fetch, execution, flush, and commit. We also describe the physical design issues that arose when implementing the microarchitecture in a 170M transistor, 130nm ASIC prototype chip composed of two 16-wide issue distributed processor cores and a distributed 1MB nonuniform (NUCA) on-chip memory system.

Index Terms

Distributed Microarchitectural Protocols in the TRIPS Prototype Processor
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Hardware

Information & Contributors

Information

Published In

MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture

December 2006

493 pages

ISBN:0769527329

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

IEEE Computer Society

United States

Qualifiers

Article

Conference

Micro-39

Sponsor:

SIGMICRO

Micro-39: The 39th Annual IEEE/ACM International Symposium on Microarchitecture

December 9 - 13, 2006

Acceptance Rates

MICRO 39 Paper Acceptance Rate 42 of 174 submissions, 24%;

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Total Citations: 44
Total Downloads: 359
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 15 Jan 2025

Citations

Cited By

Chen K, Mason Nelson T, Khadem A, Fayazi M, Singapuram S, Dreslinski R, Talati N, Kim H, Blaauw D (2024). Canalis: A Throughput-Optimized Framework for Real-Time Stream Processing of Wireless Communication. ACM Transactions on Reconfigurable Technology and Systems 17:4 (1-32). Online publication date: 18-Sep-2024.
https://dl.acm.org/doi/10.1145/3695880
Wang L, Deng Y, Gong R, Shi W, Luo L, Wang Y (2020). CSMO-DSE. ACM Journal on Emerging Technologies in Computing Systems 16:2 (1-22). Online publication date: 30-Jan-2020.
https://dl.acm.org/doi/10.1145/3371406
Nowatzki T, Ardalani N, Sankaralingam K, Weng J, Evripidou S, Stenström P, O'Boyle M (2018). Hybrid optimization/heuristic instruction scheduling for programmable accelerator codesign. Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques (1-15). Online publication date: 1-Nov-2018.
https://dl.acm.org/doi/10.1145/3243176.3243212
Takano S (2017). Performance Scalability of Adaptive Processor Architecture. ACM Transactions on Reconfigurable Technology and Systems 10:2 (1-22). Online publication date: 11-Apr-2017.
https://dl.acm.org/doi/10.1145/3007902
Sharifian A, Kumar S, Guha A, Shriraman A, Hsu W, Yang C, Lipasti M, Lee H (2016). CHAINSAW. The 49th Annual IEEE/ACM International Symposium on Microarchitecture (1-14). Online publication date: 15-Oct-2016.
https://dl.acm.org/doi/10.5555/3195638.3195698
Zhou Y, Hoffmann H, Wentzlaff D (2016). CASH. ACM SIGARCH Computer Architecture News 44:3 (682-694). Online publication date: 18-Jun-2016.
https://dl.acm.org/doi/10.1145/3007787.3001209
Jafri S, Daneshtalab M, Abbas N, Leon G, Hemani A (2016). TransMap. IEEE Transactions on Computers 65:11 (3456-3469). Online publication date: 1-Nov-2016.
https://dl.acm.org/doi/10.1109/TC.2016.2525981
Zhou Y, Hoffmann H, Wentzlaff D, Min S, Loh G (2016). CASH. Proceedings of the 43rd International Symposium on Computer Architecture (682-694). Online publication date: 18-Jun-2016.
https://dl.acm.org/doi/10.1109/ISCA.2016.65
Wu Y, Lu C, Chen Y (2016). A survey of routing algorithm for mesh Network-on-Chip. Frontiers of Computer Science: Selected Publications from Chinese Universities 10:4 (591-601). Online publication date: 1-Aug-2016.
https://dl.acm.org/doi/10.1007/s11704-016-5431-8
Nowatzki T, Gangadhar V, Sankaralingam K (2015). Exploring the potential of heterogeneous von neumann/dataflow execution models. ACM SIGARCH Computer Architecture News 43:3S (298-310). Online publication date: 13-Jun-2015.
https://dl.acm.org/doi/10.1145/2872887.2750380
