DOI: 10.1109/QSIC.2014.33
Article

Empirically Evaluating the Quality of Automatically Generated and Manually Written Test Suites

Published: 02 October 2014

Abstract

The creation, execution, and maintenance of tests are some of the most expensive tasks in software development. To help reduce the cost, automated test generation tools can be used to assist and guide developers in creating test cases. Yet, the tests that automated tools produce range from simple skeletons to fully executable test suites, hence their complexity and quality vary. This paper compares the complexity and quality of test suites created by sophisticated automated test generation tools to those of developer-written test suites. The empirical study in this paper examines ten real-world programs with existing test suites and applies two state-of-the-art automated test generation tools. The study measures the resulting test suite quality in terms of code coverage and fault-finding capability. On average, manual tests covered 31.5% of the branches while the automated tools covered 31.8% of the branches. In terms of mutation score, the tests generated by automated tools had an average mutation score of 39.8% compared to the average mutation score of 42.1% for manually written tests. Even though automatically created tests often contain more lines of source code than those written by developers, this paper's empirical results reveal that test generation tools can provide value by creating high quality test suites while reducing the cost and effort needed for testing.
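To make the two quality measures concrete, the following minimal Java sketch shows how branch coverage and mutation score are computed from raw counts. The class and method names are hypothetical illustrations, not the paper's code or the JaCoCo/MAJOR APIs, and the numbers in main are illustrative only.

```java
/**
 * Minimal sketch (hypothetical names) of the two suite-quality metrics
 * reported in the study: branch coverage and mutation score.
 */
public final class SuiteQualityMetrics {

    /** Fraction of program branches exercised by the test suite. */
    public static double branchCoverage(int coveredBranches, int totalBranches) {
        return totalBranches == 0 ? 0.0 : (double) coveredBranches / totalBranches;
    }

    /** Fraction of seeded mutants detected (killed) by the test suite. */
    public static double mutationScore(int killedMutants, int totalMutants) {
        return totalMutants == 0 ? 0.0 : (double) killedMutants / totalMutants;
    }

    public static void main(String[] args) {
        // Illustrative counts only, not the study's data.
        System.out.printf("branch coverage: %.1f%%%n", 100 * branchCoverage(318, 1000));
        System.out.printf("mutation score:  %.1f%%%n", 100 * mutationScore(398, 1000));
    }
}
```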



Reviews

Andrew Brooks

Are automatically generated test suites better than manually written test suites? To answer this question, ten Java applications with existing manually written test suites were tested using the EVOSUITE and CodePro automated test generation tools. Branch coverage and mutation scores were used to assess the quality of test suites. The JaCoCo and MAJOR tools were used to calculate these measures. EVOSUITE covered 31.86 percent of branches on average and had an average mutation score of 39.89 percent. For the manually written test suites, the figures were, respectively, 31.5 percent and 42.14 percent. The authors conclude that their results should encourage use of a tool such as EVOSUITE for test production. CodePro's test quality was found to be much lower, and this was attributed to absent or weaker oracles. Also investigated was the relationship between branch coverage and mutation score. By inspection, Figures 7 and 8 do indeed suggest correlations are present for EVOSUITE and the manually written test suites. It is unclear, however, what the actual correlation scores are. The investigators imply they calculated non-linear fits, but in Figures 7 and 8, straight lines are drawn. The analysis presented has two major weaknesses. First, there is no discussion or treatment of equivalent mutants. There can be sizable changes in mutation scores when equivalent mutants are factored out. Second, there is no discussion or treatment of the degree to which branches actually tested overlapped with branches actually containing mutations. Despite the shortcomings identified, this paper is strongly recommended to those working in software testing. Online Computing Reviews Service
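The reviewer's first criticism can be made concrete with a small sketch: equivalent mutants cannot be killed by any test, so excluding them from the denominator raises the reported mutation score. The class name and the counts below are hypothetical; the paper does not report equivalent-mutant data.

```java
/** Sketch of how equivalent mutants inflate the mutation-score denominator. */
public final class EquivalentMutantAdjustment {

    /** killed / total, ignoring equivalence. */
    static double rawScore(int killed, int total) {
        return (double) killed / total;
    }

    /** killed / (total - equivalent): equivalent mutants are excluded because no test can kill them. */
    static double adjustedScore(int killed, int total, int equivalent) {
        return (double) killed / (total - equivalent);
    }

    public static void main(String[] args) {
        int total = 1000, killed = 421, equivalent = 80; // illustrative counts only
        System.out.printf("raw mutation score:      %.1f%%%n", 100 * rawScore(killed, total));
        System.out.printf("adjusted mutation score: %.1f%%%n", 100 * adjustedScore(killed, total, equivalent));
    }
}
```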




Published In

Guide Proceedings
QSIC '14: Proceedings of the 2014 14th International Conference on Quality Software
October 2014
366 pages
ISBN:9781479971985

Publisher

IEEE Computer Society

United States

Publication History

Published: 02 October 2014


Cited By

  • (2023) NaNofuzz: A Usable Tool for Automatic Test Generation. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1114-1126. DOI: 10.1145/3611643.3616327. Online publication date: 30-Nov-2023.
  • (2022) Human-based Test Design versus Automated Test Generation: A Literature Review and Meta-Analysis. Proceedings of the 15th Innovations in Software Engineering Conference, 1-11. DOI: 10.1145/3511430.3511433. Online publication date: 24-Feb-2022.
  • (2020) How far are we from testing a program in a completely automated way, considering the mutation testing criterion at unit level? Proceedings of the XIX Brazilian Symposium on Software Quality, 1-9. DOI: 10.1145/3439961.3439977. Online publication date: 1-Dec-2020.
  • (2019) Is mutation score a fair metric? Proceedings Companion of the 2019 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity, 41-43. DOI: 10.1145/3359061.3361084. Online publication date: 20-Oct-2019.
  • (2018) Can automated test case generation cope with extract method validation? Proceedings of the XXXII Brazilian Symposium on Software Engineering, 152-161. DOI: 10.1145/3266237.3266274. Online publication date: 17-Sep-2018.
  • (2016) The complementary aspect of automatically and manually generated test case sets. Proceedings of the 7th International Workshop on Automating Test Case Design, Selection, and Evaluation, 23-30. DOI: 10.1145/2994291.2994295. Online publication date: 18-Nov-2016.
  • (2015) Experience report: how is dynamic symbolic execution different from manual testing? A study on KLEE. Proceedings of the 2015 International Symposium on Software Testing and Analysis, 199-210. DOI: 10.1145/2771783.2771818. Online publication date: 13-Jul-2015.
