research-article

Detecting semantic conflicts with unit tests

Published: 18 July 2024

Abstract

While modern merge techniques, such as 3-way and structured merge, can resolve textual conflicts automatically, they fail when the conflict arises not at the syntactic level but at the semantic level. Detecting such semantic conflicts requires understanding the behavior of the software, which is beyond the capabilities of most existing merge tools. Although semantic merge tools have been proposed, they are usually based on heavyweight static analyses or need explicit specifications of program behavior. In this work, we take a different route and propose SAM (SemAntic Merge), a semantic merge tool based on the automated generation of unit tests. These tests serve as partial specifications of the changes to be merged and drive the detection of unwanted behavior changes (conflicts) when merging software. To evaluate SAM’s feasibility for detecting conflicts, we perform an empirical study relying on a dataset of more than 80 pairs of changes integrated into common class elements (constructors, methods, and fields) from 51 merge scenarios. We also assess how the four unit test generation tools used by SAM individually contribute to conflict identification. Our results show that SAM performs best when combining only the tests generated by Differential EvoSuite and EvoSuite, and using our proposed testability transformations (nine of 29 conflicts detected). These results reinforce previous findings about the potential of test-case generation for detecting conflicts, as a method that is versatile and requires only limited deployment effort in practice.

Highlights

SAM, our semantic merge tool based on unit test generation tools;
New criteria for the detection of semantic conflicts based on unit testing;
A dataset of 85 merge scenarios with and without semantic conflicts;
Randoop Clean, our extended version of the standard tool.

Published In

Journal of Systems and Software, Volume 214, Issue C
Aug 2024
323 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Semantic conflicts
  2. Differential testing
  3. Behavior change

Qualifiers

  • Research-article
