DOI: 10.1145/2931037.2931040

Threats to the validity of mutation-based test assessment

Published: 18 July 2016

Abstract

Much research on software testing and test techniques relies on experimental studies based on mutation testing. In this paper we reveal that such studies are vulnerable to a potential threat to validity that can lead to Type I errors, that is, to incorrectly rejecting the null hypothesis. Our findings indicate that, for arbitrary experiments that fail to take countermeasures, Type I errors occur approximately 62% of the time. A Type I error could compromise any scientific conclusion drawn from such a study. We show that the problem derives from the combined use of both subsuming and subsumed mutants in these studies. We collected articles published in the last two years at three leading software engineering conferences. Of those that use mutation-based test assessment, we found that 68% are vulnerable to this threat to validity.
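
The effect described in the abstract can be illustrated with a small sketch. The kill matrix, mutant names, and test names below are invented for illustration and are not data from the paper; the subsumption check is the usual dynamic one (a mutant is treated as subsumed when some other killed mutant has a strictly smaller set of killing tests), which is an assumption about how one might operationalise the idea rather than the authors' exact procedure.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set

# Hypothetical kill matrix: mutant -> tests that kill it (toy data).
KILLS: Dict[str, FrozenSet[str]] = {
    "m1": frozenset({"t1"}),
    "m2": frozenset({"t1", "t2"}),
    "m3": frozenset({"t1", "t2", "t3"}),
    "m4": frozenset({"t1", "t3"}),
    "m5": frozenset({"t3"}),
    "m6": frozenset(),  # never killed; excluded from all scores
}


def subsuming_mutants(kills: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Return one representative per minimal non-empty kill set."""
    # Mutants with identical kill sets are indistinguishable; keep the first.
    by_kill_set: Dict[FrozenSet[str], str] = {}
    for mutant, killers in kills.items():
        if killers:
            by_kill_set.setdefault(killers, mutant)
    # A kill set is minimal if no other kill set is a proper subset of it.
    return {
        mutant
        for killers, mutant in by_kill_set.items()
        if not any(other < killers for other in by_kill_set)
    }


def mutation_score(
    kills: Dict[str, FrozenSet[str]],
    suite: Iterable[str],
    mutants: Optional[Iterable[str]] = None,
) -> float:
    """Fraction of killable mutants (optionally a subset) killed by the suite."""
    suite_set = set(suite)
    chosen = list(kills) if mutants is None else list(mutants)
    killable = [m for m in chosen if kills[m]]
    if not killable:
        return 0.0
    killed = [m for m in killable if kills[m] & suite_set]
    return len(killed) / len(killable)


if __name__ == "__main__":
    minimal = subsuming_mutants(KILLS)
    for name, suite in (("A", {"t1"}), ("B", {"t3"})):
        print(
            name,
            mutation_score(KILLS, suite),
            mutation_score(KILLS, suite, minimal),
        )
```

In this toy setting, suite A appears stronger than suite B when all mutants are counted, yet the two suites score the same on the subsuming mutants alone. The apparent difference comes entirely from subsumed mutants that are killed as a side effect, which is the kind of distortion the abstract warns about.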



Published In

ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis
July 2016
452 pages
ISBN: 9781450343909
DOI: 10.1145/2931037
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Mutation testing
  2. subsuming mutants
  3. test assessment

Qualifiers

  • Research-article

Conference

ISSTA '16
Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 22
  • Downloads (last 6 weeks): 2

Reflects downloads up to 14 Dec 2024.

Cited By

  • (2024) On the Coupling between Vulnerabilities and LLM-Generated Mutants: A Study on Vul4J Dataset. 2024 IEEE Conference on Software Testing, Verification and Validation (ICST), pages 305-316. DOI: 10.1109/ICST60714.2024.00035. Online publication date: 27-May-2024.
  • (2024) Assessing the coverage of W-based conformance testing methods over code faults. Science of Computer Programming, article 103234. DOI: 10.1016/j.scico.2024.103234. Online publication date: Nov-2024.
  • (2023) 𝜇Akka: Mutation Testing for Actor Concurrency in Akka using Real-World Bugs. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 262-274. DOI: 10.1145/3611643.3616362. Online publication date: 30-Nov-2023.
  • (2023) Guiding Greybox Fuzzing with Mutation Testing. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 929-941. DOI: 10.1145/3597926.3598107. Online publication date: 12-Jul-2023.
  • (2023) Exploring Better Black-Box Test Case Prioritization via Log Analysis. ACM Transactions on Software Engineering and Methodology, 32(3):1-32. DOI: 10.1145/3569932. Online publication date: 26-Apr-2023.
  • (2023) iBiR: Bug-report-driven Fault Injection. ACM Transactions on Software Engineering and Methodology, 32(2):1-31. DOI: 10.1145/3542946. Online publication date: 30-Mar-2023.
  • (2023) Mutation Testing in Evolving Systems: Studying the Relevance of Mutants to Code Evolution. ACM Transactions on Software Engineering and Methodology, 32(1):1-39. DOI: 10.1145/3530786. Online publication date: 13-Feb-2023.
  • (2023) Syntactic Versus Semantic Similarity of Artificial and Real Faults in Mutation Testing Studies. IEEE Transactions on Software Engineering, 49(7):3922-3938. DOI: 10.1109/TSE.2023.3277564. Online publication date: 1-Jul-2023.
  • (2023) Data-Driven Mutation Analysis for Cyber-Physical Systems. IEEE Transactions on Software Engineering, 49(4):2182-2201. DOI: 10.1109/TSE.2022.3213041. Online publication date: 1-Apr-2023.
  • (2023) Cerebro: Static Subsuming Mutant Selection. IEEE Transactions on Software Engineering, 49(1):24-43. DOI: 10.1109/TSE.2022.3140510. Online publication date: 1-Jan-2023.