DOI: 10.1145/1787275.1787342
research-article

Protective redundancy overhead reduction using instruction vulnerability factor

Published: 17 May 2010

Abstract

Due to modern technology trends, fault tolerance (FT) is attracting ever-increasing research attention. To reduce the overhead introduced by FT features, several techniques have been proposed. One of these is Instruction-Level Fault Tolerance Configurability (ILCOFT). ILCOFT enables application developers to protect different instructions to varying degrees, devoting more resources to the most critical instructions and saving resources by weakening the protection of others. It is, however, not trivial to assign a proper protection level to every instruction. This work introduces the notion of the Instruction Vulnerability Factor (IVF), which evaluates how faults in each instruction affect the final application output. The IVF is computed off-line and is then used by ILCOFT-enabled systems to assign the appropriate protection level to every instruction. IVF relieves the programmer of the need to assign the necessary protection level to every instruction by hand. Experimental results demonstrate that IVF-based ILCOFT reduces the instruction duplication performance penalty by up to 77%, while the maximum output damage due to undetected faults does not exceed 0.6% of the total application output.
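As an illustration of how an off-line IVF profile might drive the protection decision, the following is a minimal Python sketch, not the authors' implementation. The names `InstrProfile`, `assign_protection`, and `duplicated_fraction`, the two-level "duplicate"/"single" scheme, the sample profile values, and the 0.01 threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InstrProfile:
    """Hypothetical off-line profile entry for one static instruction."""
    name: str
    ivf: float        # average fraction of the output corrupted by a fault here
    dyn_count: int    # how many times the instruction executes


def assign_protection(profiles: List[InstrProfile], threshold: float) -> Dict[str, str]:
    """Duplicate instructions whose IVF exceeds the threshold; run the rest once."""
    return {
        p.name: "duplicate" if p.ivf > threshold else "single"
        for p in profiles
    }


def duplicated_fraction(profiles: List[InstrProfile], threshold: float) -> float:
    """Fraction of dynamic instructions that still get duplicated."""
    total = sum(p.dyn_count for p in profiles)
    if total == 0:
        return 0.0
    dup = sum(p.dyn_count for p in profiles if p.ivf > threshold)
    return dup / total


# Made-up profile: an address computation that can corrupt most of the output,
# and a per-pixel add whose faults stay local to one output item.
PROFILE = [
    InstrProfile("addr_add", ivf=0.90, dyn_count=100),
    InstrProfile("pixel_add", ivf=0.001, dyn_count=300),
]

if __name__ == "__main__":
    print(assign_protection(PROFILE, threshold=0.01))
    print(duplicated_fraction(PROFILE, threshold=0.01))
```

With this made-up profile, only a quarter of the dynamic instructions are duplicated, which is the kind of saving selective protection aims for. The threshold is a tuning knob in this sketch, trading duplication overhead against the output damage tolerated from undetected faults.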

References

[1]
P. Shivakumar, M. Kistler, S. Keckler, D. Burger, and L. Alvisi, "Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic," in DSN-02: Proc. 2002 Int. Conf. on Dependable Systems and Networks, Washington, DC, USA, 2002, pp. 389--398.
[2]
T. Rao and E. Fujiwara, Error-Control Coding for Computer Systems. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1989.
[3]
D. Borodin, B. Juurlink, and S. Vassiliadis, "Instruction-Level Fault Tolerance Configurability," in IC-SAMOS VII: Proc. Int. Conf. on Embedded Computer Systems: Architectures, Modeling, and Simulation, July 2007, pp. 110--117.
[4]
D. Borodin, B. Juurlink, S. Hamdioui, and S. Vassiliadis, "Instruction-Level Fault Tolerance Configurability," Journal of Signal Processing Systems, vol. 57, no. 1, pp. 89--105, October 2009.
[5]
A. Sundaram, A. Aakel, D. Lockhart, D. Thaker, and D. Franklin, "Efficient Fault Tolerance in Multi-Media Applications through Selective Instruction Replication," in WREFT-08: Proc. of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies. New York, NY, USA: ACM, 2008, pp. 339--346.
[6]
S. Mukherjee, C. Weaver, J. Emer, S. Reinhardt, and T. Austin, "A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor," in MICRO-36: Proc. of the 36th Annual IEEE/ACM Int. Symp. on Microarchitecture. Washington, DC, USA: IEEE Computer Society, 2003, p. 29.
[7]
T. Austin, E. Larson, and D. Ernst, "SimpleScalar: An Infrastructure for Computer System Modeling," Computer, vol. 35, no. 2, pp. 59--67, 2002.
[8]
M. Franklin, "A Study of Time Redundant Fault Tolerance Techniques for Superscalar Processors," Proc. IEEE Int. Workshop on Defect and Fault Tolerance in VLSI Systems, pp. 207--215, Nov 1995.
[9]
J. von Neumann, "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components," in Automata Studies, ser. Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 1956, vol. 34, pp. 43--98.
[10]
B. Johnson, Design and Analysis of Fault-Tolerant Digital Systems. Addison-Wesley, Jan 1989.
[11]
Fibonacci numbers at Wikipedia, http://en.wikipedia.org/wiki/Fibonacci_number.
[12]
C. Lee, M. Potkonjak, and W. H. Mangione-Smith, "MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems," in MICRO-30: Proc. of the 30th Annual ACM/IEEE Int. Symp. on Microarchitecture. Washington, DC, USA: IEEE Computer Society, 1997, pp. 330--335.
[13]
N. Oh, P. P. Shirvani, and E. J. McCluskey, "Error Detection by Duplicated Instructions in Super-Scalar Processors," IEEE Transactions on Reliability, vol. 51, no. 1, pp. 63--75, Mar 2002.


Reviews

Amos O Olagunju

Due to increasingly pervasive vulnerabilities and attacks, emerging computer technologies require new and effective fault tolerance mechanisms and algorithms. How should fault tolerance mechanisms and algorithms be designed and implemented to minimize the energy consumption, hardware overhead, and instruction processing performance penalties attributable to fault tolerance features? How should programmers assign an adequate level of security to crucial instructions for mission-critical applications? Advocating for programmers to become more proficient at assigning the adequate protection level to every computer instruction, Borodin and Juurlink present a novel metric for offline evaluation of the effect of faults in individual instructions on the overall result of computer applications. The proposed plan requires a simulation environment or the injection of faults by hardware into every instruction, in order to generate the offline profile of vulnerability for each instruction. In the scheme, faults are injected into each executed instruction that generates a result, such as arithmetic operations and memory loads, and the application's output is compared against the correct result to gauge the effect of the faulty instruction. The instruction vulnerability factor (IVF) is the percentage of output items or bytes the fault corrupts, depending on the type of application. The average IVF rates are estimated, stored, and used to enforce adequate protection, only for time-consuming components of an application. The authors outline a technique for utilizing the IVF rate to execute each instruction once with no error detection, or to duplicate and validate its execution. Borodin and Juurlink perform simulations with different kernels and applications, such as image addition, matrix multiplication, sum of absolute differences, computation of Fibonacci numbers, sound compression, and encoders and decoders for image compression.
The experimental results of the evaluation of individual instruction-level vulnerabilities show significant performance improvement over well-known instruction-level fault tolerance configuration techniques [1]. The paper clearly articulates the issues of imprecise IVF assessment due to uncertainties in instruction execution of dynamic real-world applications. The IVF estimation is only appropriate for applications with evenly significant rates, but the authors offer valuable insights into the efficient design and implementation of fault tolerance mechanisms and algorithms.

Online Computing Reviews Service
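The profiling step the review describes (inject a fault into an executed instruction, compare the output with the correct result, and record the corrupted fraction) can be sketched on a toy kernel. The Python below is an assumption-laden illustration rather than the authors' simulator: the single-bit-flip fault model, the function names `fib_sequence`, `dynamic_ivf`, and `static_ivf`, and the treatment of each loop iteration as one dynamic add instruction are all choices made for this example.

```python
from typing import List, Optional, Sequence


def fib_sequence(n: int, fault_at: Optional[int] = None, bit: int = 0) -> List[int]:
    """Return the first n Fibonacci numbers (n >= 2).

    Each loop iteration stands for one dynamic execution of the add
    instruction. If fault_at equals the iteration index, one bit of that
    add's result is flipped (assumed fault model).
    """
    out = [0, 1]
    for step in range(n - 2):
        s = out[-1] + out[-2]
        if step == fault_at:
            s ^= 1 << bit  # inject a single-bit fault into the result
        out.append(s)
    return out


def dynamic_ivf(n: int, step: int, bits: Sequence[int] = range(8)) -> float:
    """Fraction of output items corrupted by a fault in one dynamic add,
    averaged over the tested bit positions."""
    golden = fib_sequence(n)
    corrupted = 0
    for bit in bits:
        faulty = fib_sequence(n, fault_at=step, bit=bit)
        corrupted += sum(g != f for g, f in zip(golden, faulty))
    return corrupted / (len(bits) * len(golden))


def static_ivf(n: int) -> float:
    """Average IVF of the add instruction over all its dynamic executions."""
    steps = range(n - 2)
    return sum(dynamic_ivf(n, s) for s in steps) / len(steps)


if __name__ == "__main__":
    print(dynamic_ivf(10, 0))  # fault in the first add
    print(dynamic_ivf(10, 7))  # fault in the last add
    print(static_ivf(10))
```

In this toy kernel a fault in the first add propagates to every later output item, while a fault in the last add corrupts only one item, so the same static instruction has very different per-execution vulnerability. This is one way to see why an averaged IVF can be imprecise for real applications.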


Information & Contributors

Information

Published In

CF '10: Proceedings of the 7th ACM international conference on Computing frontiers
May 2010
370 pages
ISBN:9781450300445
DOI:10.1145/1787275
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. fault detection
  2. instruction vulnerability
  3. performance
  4. redundancy
  5. selective protection

Qualifiers

  • Research-article

Conference

CF'10
Sponsor:
CF'10: Computing Frontiers Conference
May 17 - 19, 2010
Bertinoro, Italy

Acceptance Rates

CF '10 Paper Acceptance Rate 30 of 113 submissions, 27%;
Overall Acceptance Rate 273 of 785 submissions, 35%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 04 Jan 2025

Cited By

  • (2024) TCC: GPGPU Architecture for Instruction Decoder and Control Flow Error Detection. 2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS), pp. 104-109. DOI: 10.1109/DDECS60919.2024.10508915. Online publication date: 3-Apr-2024
  • (2023) gemV-tool: A Comprehensive Soft Error Reliability Estimation Tool for Design Space Exploration. Electronics 12:22 (4573). DOI: 10.3390/electronics12224573. Online publication date: 8-Nov-2023
  • (2022) Braum: Analyzing and Protecting Autonomous Machine Software Stack. 2022 IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE), pp. 85-96. DOI: 10.1109/ISSRE55969.2022.00019. Online publication date: Oct-2022
  • (2022) Paralellism-Based Techniques for Slowing Down Soft Error Propagation. 2022 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 1-6. DOI: 10.1109/DASC/PiCom/CBDCom/Cy55231.2022.9927870. Online publication date: 12-Sep-2022
  • (2022) Studying error propagation on application data structure and hardware. The Journal of Supercomputing 78:17 (18691-18724). DOI: 10.1007/s11227-022-04625-x. Online publication date: 1-Nov-2022
  • (2022) Regional soft error vulnerability and error propagation analysis for GPGPU applications. The Journal of Supercomputing 78:3 (4095-4130). DOI: 10.1007/s11227-021-04026-6. Online publication date: 1-Feb-2022
  • (2022) Quantifying the impact of data replication on error propagation. Cluster Computing 26:3 (1985-1999). DOI: 10.1007/s10586-022-03726-9. Online publication date: 13-Sep-2022
  • (2021) Characterizing System-Level Masking Effects against Soft Errors. Electronics 10:18 (2286). DOI: 10.3390/electronics10182286. Online publication date: 17-Sep-2021
  • (2020) PB-IFMC: A Selective Soft Error Protection Method Based on Instruction Fault Masking Capability. 2020 25th International Computer Conference, Computer Society of Iran (CSICC), pp. 1-9. DOI: 10.1109/CSICC49403.2020.9050059. Online publication date: Jan-2020
  • (2020) Software reliability enhancement against hardware transient errors using inherently reliable data structures. International Journal of System Assurance Engineering and Management 11:5 (883-898). DOI: 10.1007/s13198-020-01011-9. Online publication date: 26-Jun-2020
