
Power and performance evaluation of globally asynchronous locally synchronous processors

Published: 01 May 2002

Abstract

Due to shrinking technologies and increasing design sizes, it is becoming more difficult and expensive to distribute a global clock signal with low skew throughout a processor die. Asynchronous processor designs do not suffer from this problem since they do not have a global clock. However, a paradigm shift from synchronous to asynchronous design is unlikely to happen in the processor industry in the near future. Hence the study of Globally Asynchronous Locally Synchronous (GALS) systems is relevant. In this paper we use a cycle-accurate simulation environment to study the impact of asynchrony in a superscalar processor architecture. Our results show that, as expected, going from a synchronous to a GALS design causes a drop in performance, but elimination of the global clock does not lead to drastic power reductions. From a power perspective, GALS designs are inherently less efficient than synchronous architectures. However, the flexibility offered by the independently controllable local clocks enables the effective use of other energy conservation techniques such as dynamic voltage scaling. Our results show that for a GALS processor with five clock domains, the drop in performance ranges from 5% to 15%, while power consumption is reduced by 10% on average. Fine-grained voltage scaling reduces the gap between fully synchronous and GALS implementations, allowing for better power efficiency.
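As a rough illustration of why independently controllable local clocks help dynamic voltage scaling, the sketch below applies the standard first-order CMOS dynamic power relation, P ≈ C·V²·f, to a set of clock domains. It is not the paper's simulation model. The domain names, capacitances, voltages, and frequencies are hypothetical placeholders, and the assumption that frequency scales linearly with supply voltage is a simplification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Domain:
    """One locally synchronous clock domain (all values hypothetical)."""
    name: str
    cap: float   # effective switched capacitance, arbitrary units
    vdd: float   # supply voltage, volts
    freq: float  # local clock frequency, GHz


def dynamic_power(d: Domain) -> float:
    """First-order CMOS dynamic power: P ~ C * V^2 * f."""
    return d.cap * d.vdd ** 2 * d.freq


def total_power(domains: list[Domain]) -> float:
    """Sum of dynamic power over all domains."""
    return sum(dynamic_power(d) for d in domains)


def scale_domain(d: Domain, factor: float) -> Domain:
    """Scale voltage and frequency of one domain by the same factor.

    Assumes frequency tracks voltage linearly, which is only a
    first-order approximation.
    """
    return replace(d, vdd=d.vdd * factor, freq=d.freq * factor)


# Hypothetical five-domain partition; names and numbers are illustrative only.
domains = [
    Domain("fetch", 1.0, 1.5, 1.0),
    Domain("decode", 0.8, 1.5, 1.0),
    Domain("integer", 1.2, 1.5, 1.0),
    Domain("floating_point", 1.5, 1.5, 1.0),
    Domain("memory", 1.3, 1.5, 1.0),
]

# Slow down only the floating-point domain, leaving the others untouched.
scaled = [scale_domain(d, 0.8) if d.name == "floating_point" else d
          for d in domains]

print(f"baseline power: {total_power(domains):.3f}")
print(f"scaled power:   {total_power(scaled):.3f}")
```

Under these assumptions, scaling a single domain's voltage and frequency by a factor k cuts that domain's dynamic power by k³ while the other domains keep running at full speed. A single global clock cannot offer this kind of per-domain control.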



Published In

ACM SIGARCH Computer Architecture News, Volume 30, Issue 2
Special Issue: Proceedings of the 29th annual international symposium on Computer architecture (ISCA '02)
May 2002
304 pages
ISSN: 0163-5964
DOI: 10.1145/545214

ISCA '02: Proceedings of the 29th annual international symposium on Computer architecture
May 2002
346 pages
ISBN: 076951605X
Conference Chair: Yale Patt
Program Chair: Dirk Grunwald
Publications Chair: Kevin Skadron

Publisher

Association for Computing Machinery
New York, NY, United States

