DOI: 10.5555/2485288.2485367

Overcoming post-silicon validation challenges through quick error detection (QED)

Published: 18 March 2013

Abstract

Existing post-silicon validation techniques are generally ad hoc, and their cost and complexity are rising faster than design cost. Hence, systematic approaches to post-silicon validation are essential. Our research indicates that many of the bottlenecks of existing post-silicon validation approaches are direct consequences of very long error detection latencies. Error detection latency is the time elapsed between the activation of a bug during post-silicon validation and its detection or manifestation as a system failure. In our earlier papers, we created the Quick Error Detection (QED) technique to overcome this significant challenge. QED systematically creates a wide variety of post-silicon validation tests to detect bugs in processor cores and uncore components of multi-core System-on-Chips (SoCs) very quickly, i.e., with very short error detection latencies. In this paper, we present an overview of QED and summarize key results:

  1. Error detection latencies of "typical" post-silicon validation tests can range up to billions of clock cycles.
  2. QED shortens error detection latencies by up to 6 orders of magnitude.
  3. QED enables 2- to 4-fold improvement in bug coverage.

QED does not require any hardware modification. Hence, it is readily applicable to existing designs.
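
The definition of error detection latency in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the cycle counts are assumptions chosen to match the scale the abstract mentions (billions of clock cycles, up to 6 orders of magnitude), not data or code from the paper.

```python
import math


def error_detection_latency(activation_cycle: int, detection_cycle: int) -> int:
    """Clock cycles elapsed between bug activation and its detection."""
    if detection_cycle < activation_cycle:
        raise ValueError("a bug cannot be detected before it is activated")
    return detection_cycle - activation_cycle


def orders_of_magnitude(baseline_latency: int, improved_latency: int) -> float:
    """Base-10 logarithm of the ratio between two latencies."""
    return math.log10(baseline_latency / improved_latency)


# Hypothetical cycle counts, chosen only to illustrate the scale.
baseline = error_detection_latency(1_000, 2_000_001_000)  # long-latency test
improved = error_detection_latency(1_000, 3_000)          # short-latency test
reduction = orders_of_magnitude(baseline, improved)

print(f"baseline latency: {baseline} cycles")
print(f"improved latency: {improved} cycles")
print(f"reduction: about {reduction:.1f} orders of magnitude")
```

Expressing the improvement as a base-10 logarithm of the latency ratio is one straightforward way to read the phrase "orders of magnitude"; the paper itself may measure and report the reduction differently.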

        Published In

        DATE '13: Proceedings of the Conference on Design, Automation and Test in Europe
        March 2013
        1944 pages
        ISBN:9781450321532

        Publisher

        EDA Consortium

        San Jose, CA, United States

        Author Tags

        1. debug
        2. post-silicon validation
        3. quick error detection
        4. testing
        5. verification

        Qualifiers

        • Research-article

        Conference

        DATE 13
        Sponsor:
        • EDAA
        • EDAC
        • SIGDA
        • The Russian Academy of Sciences
        DATE 13: Design, Automation and Test in Europe
        March 18 - 22, 2013
        Grenoble, France

        Acceptance Rates

        Overall Acceptance Rate 518 of 1,794 submissions, 29%
