DOI: 10.1145/3475716.3484488
keynote

How Empirical Research Supports Tool Development: A Retrospective Analysis and new Horizons

Published: 11 October 2021

Abstract

Empirical research supports the development of approaches and tools for software engineers in two ways. On the one hand, empirical studies help to understand a phenomenon or context of interest. On the other hand, they compare approaches and evaluate how software engineers could benefit from them. Over the past decades, there has been a tangible evolution in how empirical evaluation is conducted in software engineering, for multiple reasons. First, the research community has matured considerably, thanks in part to guidelines developed by several researchers. Second, the wide availability of data and artifacts, mainly from open-source projects, has made it possible to conduct larger evaluations and, in some cases, to reach study participants. This keynote will first give an overview of how empirical research has been used over the past decades to evaluate tools, and how this is changing over the years. Then, we will focus on the importance of combining quantitative and qualitative evaluations, and on how "depth" sometimes turns out to be more useful than just "breadth". We will also emphasize that research is not a straightforward path and that negative results are often an essential component of future advances. Last but not least, we will discuss how the role of empirical evaluation is changing with the pervasiveness of artificial intelligence methods in software engineering research.


    Published In

    ESEM '21: Proceedings of the 15th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
    October 2021
    368 pages
    ISBN:9781450386654
    DOI:10.1145/3475716
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Empirical Evaluation
    2. Open Source
    3. Qualitative Studies
    4. Recommender Systems

    Qualifiers

    • Keynote
    • Research
    • Refereed limited

    Conference

    ESEM '21
    Acceptance Rates

    ESEM '21 Paper Acceptance Rate: 24 of 124 submissions, 19%
    Overall Acceptance Rate: 130 of 594 submissions, 22%
