survey

Assessing Dependability with Software Fault Injection: A Survey

Authors:

Roberto Natella,

Domenico Cotroneo,

Henrique S. Madeira

ACM Computing Surveys (CSUR), Volume 48, Issue 3

Article No.: 44, Pages 1 - 55

https://doi.org/10.1145/2841425

Published: 08 February 2016

Abstract

With the rise of software complexity, software-related accidents represent a significant threat for computer-based systems. Software Fault Injection is a method to anticipate worst-case scenarios caused by faulty software through the deliberate injection of software faults. This survey provides a comprehensive overview of the state of the art on Software Fault Injection to support researchers and practitioners in the selection of the approach that best fits their dependability assessment goals, and it discusses how these approaches have evolved to achieve fault representativeness, efficiency, and usability. The survey includes a description of relevant applications of Software Fault Injection in the context of fault-tolerant systems.
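To make the idea of deliberately injecting software faults concrete, here is a minimal sketch, not taken from the survey. It wraps a component so that selected calls fail, then checks whether a simple fault-tolerance mechanism copes. Every name in it (`read_sensor`, `inject_fault`, `read_with_retry`, `run_experiment`) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Set


def read_sensor() -> float:
    """Hypothetical component under test: returns a temperature reading."""
    return 21.5


def inject_fault(component: Callable[[], float],
                 fail_on_calls: Set[int]) -> Callable[[], float]:
    """Wrap `component` so that the selected invocations raise an error.

    This emulates the effect of a faulty component at its interface,
    without modifying the component's own code.
    """
    calls = 0

    def faulty() -> float:
        nonlocal calls
        calls += 1
        if calls in fail_on_calls:
            raise RuntimeError(f"injected fault on call {calls}")
        return component()

    return faulty


def read_with_retry(component: Callable[[], float],
                    attempts: int = 3,
                    fallback: float = 0.0) -> float:
    """Hypothetical fault-tolerance mechanism: retry, then use a default."""
    for _ in range(attempts):
        try:
            return component()
        except RuntimeError:
            continue
    return fallback


def run_experiment(fail_on_calls: Set[int]) -> str:
    """Run one injection experiment and classify its outcome."""
    faulty = inject_fault(read_sensor, fail_on_calls)
    value = read_with_retry(faulty)
    # "tolerated" means the caller still obtained the fault-free result.
    return "tolerated" if value == read_sensor() else "degraded"


if __name__ == "__main__":
    for scenario in (set(), {1}, {1, 2, 3}):
        print(sorted(scenario), run_experiment(scenario))
```

A single transient failure is masked by the retry loop, while three consecutive failures exhaust the retries and force the fallback value. Real injectors of the kind the survey covers work at other levels too, for example by mutating source code or binaries, so this interface-level wrapper shows only one simple option.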

Published In

ACM Computing Surveys Volume 48, Issue 3

February 2016

619 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/2856149

Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering/University of Florida/Gainesville

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 February 2016

Accepted: 01 October 2015

Revised: 01 July 2015

Received: 01 June 2013

Published in CSUR Volume 48, Issue 3

Qualifiers

Survey
Research
Refereed

Funding Sources

Italian Ministry of Education, University and Research
CECRIS FP7 project
COSMIC public-private laboratory

Article Metrics

Total Citations: 141
Total Downloads: 2,206
Downloads (Last 12 months): 264
Downloads (Last 6 weeks): 16

Reflects downloads up to 26 Jan 2025

Cited By

Dai Y, Liu S, Liu H (2025) Mutation-Based Approach to Supporting Human–Machine Pair Inspection. Electronics 14:2 (382). Online publication date: 19-Jan-2025
https://doi.org/10.3390/electronics14020382
Amyan A, Abboush M, Knieke C, Rausch A (2024) Automating Fault Test Cases Generation and Execution for Automotive Safety Validation via NLP and HIL Simulation. Sensors 24:10 (3145). Online publication date: 15-May-2024
https://doi.org/10.3390/s24103145
Rozsíval M, Christakis M, Pradel M (2024) Automated Testing of Networked Systems Reliability. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (1920-1922). Online publication date: 11-Sep-2024
https://dl.acm.org/doi/10.1145/3650212.3685559
Sharma S, Tanksalkar S, Cherupattamoolayil S, Machiry A, Quek T, Gao D, Zhou J, Cardenas A (2024) Fuzzing API Error Handling Behaviors using Coverage Guided Fault Injection. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security (1495-1509). Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.1145/3634737.3637650
Rønnestad A, Ceccarelli A, Montecchi L, Hong J, Park J (2024) Validation of Safety Metrics for Object Detectors in Autonomous Driving. Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing (1559-1568). Online publication date: 8-Apr-2024
https://dl.acm.org/doi/10.1145/3605098.3635907
Lu Z, Zhang W, Xu H, Jiang J (2024) A Reliability Benchmarking Method for Linux. 2024 IEEE 24th International Conference on Software Quality, Reliability, and Security Companion (QRS-C) (177-186). Online publication date: 1-Jul-2024
https://doi.org/10.1109/QRS-C63300.2024.00033
Li R (2024) Software Quality Testing Framework based on Machine Learning Analysis. 2024 5th International Conference on Mobile Computing and Sustainable Informatics (ICMCSI) (396-401). Online publication date: 18-Jan-2024
https://doi.org/10.1109/ICMCSI61536.2024.00063
Barletta M, Cinque M, Di Martino C, Kalbarczyk Z, Iyer R (2024) Mutiny! How Does Kubernetes Fail, and What Can We Do About It? 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (1-14). Online publication date: 24-Jun-2024
https://doi.org/10.1109/DSN58291.2024.00016
Cotroneo D, Liguori P (2024) Neural Fault Injection: Generating Software Faults from Natural Language. 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S) (23-27). Online publication date: 24-Jun-2024
https://doi.org/10.1109/DSN-S60304.2024.00016
Smit T, Forlin B, Chen K, Souvatzoglou I, Psarakis M, Ottavi M (2024) An Enhanced Fault Injection Framework for FPGA-Based Soft-Cores. 2024 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) (1-6). Online publication date: 8-Oct-2024
https://doi.org/10.1109/DFT63277.2024.10753564
