DOI: 10.1145/2751504.2751512
short-paper

Empirical Studies of the Soft Error Susceptibility of Sorting Algorithms to Statistical Fault Injection

Published: 15 June 2015

Abstract

Soft errors are becoming an important issue in computing systems. Near-threshold voltage (NTV) operation, shrinking circuit sizes, high-performance computing (HPC), and high-altitude computing all present challenges in this area. Much of the existing literature has focused on hardware techniques for measuring and mitigating soft errors. In this paper we instead explore the soft error susceptibility of three common sorting algorithms at the software layer. We focus on the comparison operator and use our software fault injection tool to place faults with fine precision during the execution of these algorithms. We explore how each algorithm's susceptibility varies with input and bit position, and we relate these faults back to the source code to study how algorithmic decisions affect the reliability of the codes. Finally, we examine how many fault injections are required for statistical significance. Using the standard equations from hardware fault injection experiments, we calculate the number of injections that should be required to achieve confidence in our results. We then show empirically that more fault injections than this are required before we gain confidence in our experiments.
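
The two ideas in the abstract, a single bit flip in a comparison operand and a precomputed injection count, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not name the three sorting algorithms (bubble sort is used only because it is short), the function names `flip_bit`, `bubble_sort_with_fault`, and `sample_size` are hypothetical, and the sample-size formula is one widely used finite-population estimate, not necessarily the exact equation the authors applied. The authors' tool injects faults into a running program, not at the Python level.

```python
import math


def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit position inverted (a single bit flip)."""
    return value ^ (1 << bit)


def bubble_sort_with_fault(data, fault_at=None, bit=0):
    """Bubble sort that optionally corrupts one operand of one comparison.

    fault_at is the zero-based index of the comparison to corrupt. Only the
    value seen by that comparison is flipped; the stored element is untouched,
    which models a transient fault rather than a memory corruption.
    """
    a = list(data)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            left = a[i]
            if comparisons == fault_at:
                left = flip_bit(left, bit)  # hypothetical injection point
            if left > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
            comparisons += 1
    return a


def sample_size(population: int, margin: float, t: float = 1.96, p: float = 0.5) -> int:
    """Assumed estimate: n = N / (1 + e^2 (N - 1) / (t^2 p (1 - p))).

    population (N) is the number of possible faults, margin (e) the error
    margin, t the cut-off for the confidence level (1.96 for 95%), and p the
    estimated proportion of faults that cause a failure (0.5 is worst case).
    """
    n = population / (1 + margin**2 * (population - 1) / (t**2 * p * (1 - p)))
    return math.ceil(n)
```

With the input `[3, 1, 2]`, flipping bit 1 of the first comparison's left operand yields `[1, 3, 2]`, a silently wrong result, while flipping bit 3 of the same operand is masked and the output is still sorted. This small case shows why susceptibility can depend on both the input and the bit position.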

    Published In

    FTXS '15: Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale
    June 2015
    78 pages
    ISBN: 9781450335690
    DOI: 10.1145/2751504
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fault injection
    2. resilience
    3. soft error
    4. sorting algorithms
    5. vulnerability

    Qualifiers

    • Short-paper

    Conference

    HPDC'15

    Acceptance Rates

    FTXS '15 Paper Acceptance Rate 9 of 15 submissions, 60%;
    Overall Acceptance Rate 16 of 25 submissions, 64%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 16
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 11 Dec 2024

    Citations

    Cited By

    • (2024) Understanding Silent Data Corruption in Processors for Mitigating its Effects. ACM Transactions on Architecture and Code Optimization 21:4 (1-27). DOI: 10.1145/3690825. Online publication date: 20-Nov-2024
    • (2024) Harpocrates: Breaking the Silence of CPU Faults through Hardware-in-the-Loop Program Generation. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (516-531). DOI: 10.1109/ISCA59077.2024.00045. Online publication date: 29-Jun-2024
    • (2024) Druto: Upper-Bounding Silent Data Corruption Vulnerability in GPU Applications. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (582-594). DOI: 10.1109/IPDPS57955.2024.00058. Online publication date: 27-May-2024
    • (2023) Silent Data Corruptions: Microarchitectural Perspectives. IEEE Transactions on Computers 72:11 (3072-3085). DOI: 10.1109/TC.2023.3285094. Online publication date: Nov-2023
    • (2021) Cores that don't count. Proceedings of the Workshop on Hot Topics in Operating Systems (9-16). DOI: 10.1145/3458336.3465297. Online publication date: 1-Jun-2021
    • (2017) Verifying Reliability Properties Using the Hyperball Abstract Domain. ACM Transactions on Programming Languages and Systems 40:1 (1-29). DOI: 10.1145/3156017. Online publication date: 19-Dec-2017
    • (2017) On the Inherent Resilience of Integer Operations. Euro-Par 2016: Parallel Processing Workshops (648-659). DOI: 10.1007/978-3-319-58943-5_52. Online publication date: 28-May-2017
    • (2016) Design, Use and Evaluation of P-FSEFI. Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques (9-17). DOI: 10.5555/3021426.3021429. Online publication date: 22-Aug-2016
    • (2015) Towards Building Resilient Scientific Applications. Proceedings of the 2015 IEEE International Conference on Cluster Computing (176-179). DOI: 10.1109/CLUSTER.2015.35. Online publication date: 8-Sep-2015
