DOI: 10.1145/2593069.2593146
research-article

GUARD: GUAranteed Reliability in Dynamically Reconfigurable Systems

Published: 01 June 2014

Abstract

Soft errors are a reliability threat for reconfigurable systems implemented with SRAM-based FPGAs. They can be handled through fault tolerance techniques such as scrubbing and modular redundancy. However, selecting these techniques statically at design or compile time tends to be pessimistic and prevents optimal adaptation to a soft error rate that changes at runtime.
We present the GUARD method, which allows for autonomous runtime reliability management in reconfigurable architectures: based on the error rate observed during runtime, the runtime system dynamically determines whether a computation should be executed by a hardened processor or accelerated by inherently less reliable reconfigurable hardware, which can trade off performance and reliability. GUARD is the first runtime system for reconfigurable architectures that guarantees a target reliability while optimizing performance. This allows applications to dynamically choose the desired degree of reliability. Compared to related work with statically optimized fault tolerance techniques, GUARD provides up to 68.3% higher performance at the same target reliability.
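
The decision described above can be illustrated with a minimal sketch. It is not the GUARD algorithm from the paper, which the abstract does not specify. It assumes that soft errors arrive as a Poisson process, that each execution option has a fixed execution time and a fixed number of critical bits (bits whose upset causes a failure), and that the hardened processor has none. The names `Option`, `reliability`, `select_option`, and all numbers are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One way to execute a computation (all values are illustrative)."""
    name: str
    exec_time_s: float    # assumed execution time in seconds
    critical_bits: float  # assumed bits whose upset causes a failure


def reliability(option: Option, error_rate_per_bit_s: float) -> float:
    """Probability of no failure-causing upset during one execution,
    under the assumed Poisson error model."""
    exposure = option.critical_bits * option.exec_time_s
    return math.exp(-error_rate_per_bit_s * exposure)


def select_option(options, error_rate_per_bit_s, target_reliability):
    """Return the fastest option whose reliability meets the target."""
    feasible = [
        o for o in options
        if reliability(o, error_rate_per_bit_s) >= target_reliability
    ]
    if not feasible:
        raise ValueError("no option meets the target reliability")
    return min(feasible, key=lambda o: o.exec_time_s)


# Hypothetical options: a hardened processor (slow, assumed immune) and
# two accelerator variants that trade performance for reliability.
OPTIONS = [
    Option("hardened_cpu", exec_time_s=10.0, critical_bits=0.0),
    Option("accel_redundant", exec_time_s=2.0, critical_bits=1e4),
    Option("accel_plain", exec_time_s=1.0, critical_bits=1e6),
]

if __name__ == "__main__":
    # Re-evaluate the choice as the observed error rate changes.
    for rate in (1e-12, 1e-9, 1e-6):
        choice = select_option(OPTIONS, rate, target_reliability=0.9999)
        print(f"rate={rate:g}/bit/s -> {choice.name}")
```

In this sketch the fastest accelerator is chosen while the observed error rate is low, and the selection falls back to the redundant variant and then to the hardened processor as the rate rises. A static choice would have to assume the worst-case rate throughout, which is the pessimism the abstract refers to.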


      Information & Contributors

      Information

      Published In

      ACM Other conferences
      DAC '14: Proceedings of the 51st Annual Design Automation Conference
      June 2014
      1249 pages
      ISBN:9781450327305
      DOI:10.1145/2593069
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      DAC '14

      Acceptance Rates

      Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


      Cited By

      • (2022) Information Technology of Reconfigurable Systems Based on Programmable Logic Chips for the Control Module of a Wireless Sensor Network. Automation of technological and business processes, 14:3 (20-26). DOI: 10.15673/atbp.v14i3.2349. Online publication date: 11-Oct-2022
      • (2020) Reconfigurable Framework for Environmentally Adaptive Resilience in Hybrid Space Systems. ACM Transactions on Reconfigurable Technology and Systems, 13:3 (1-32). DOI: 10.1145/3398380. Online publication date: 16-Jul-2020
      • (2020) Optimal Runtime Algorithm to Improve Fault Tolerance of Bus-Based Reconfigurable Designs. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 28:4 (914-925). DOI: 10.1109/TVLSI.2019.2961782. Online publication date: Apr-2020
      • (2019) Self-Adaptation for Availability in CPU-FPGA Systems Under Soft Errors. 2019 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), (9-16). DOI: 10.1109/AHS.2019.000-6. Online publication date: Jul-2019
      • (2019) Hybrid scheduling to enhance reliability of real-time tasks running on reconfigurable devices. The Journal of Supercomputing. DOI: 10.1007/s11227-019-02976-6. Online publication date: 27-Aug-2019
      • (2019) i-Core: A Runtime-Reconfigurable Processor Platform for Cyber-Physical Systems. Embedded, Cyber-Physical, and IoT Systems, (1-36). DOI: 10.1007/978-3-030-16949-7_1. Online publication date: 29-Jun-2019
      • (2018) FPGA-Based High-Performance Embedded Systems for Adaptive Edge Computing in Cyber-Physical Systems: The ARTICo3 Framework. Sensors, 18:6 (1877). DOI: 10.3390/s18061877. Online publication date: 8-Jun-2018
      • (2018) Input-Aware Implication Selection Scheme Utilizing ATPG for Efficient Concurrent Error Detection. Electronics, 7:10 (258). DOI: 10.3390/electronics7100258. Online publication date: 17-Oct-2018
      • (2018) Self-Test and Diagnosis for Self-Aware Systems. IEEE Design & Test, 35:5 (7-18). DOI: 10.1109/MDAT.2017.2762903. Online publication date: Oct-2018
      • (2018) Error Indication Signal Collapsing for Implication-Based Concurrent Error Detection. 2018 IEEE International Test Conference in Asia (ITC-Asia), (127-132). DOI: 10.1109/ITC-Asia.2018.00032. Online publication date: Aug-2018
