Research article | Open access

Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice

Published: 12 July 2024

Abstract

Large Language Models (LLMs) are frequently discussed in academia and the general public as support tools for virtually any use case that relies on the production of text, including software engineering. Currently, there is much debate, but little empirical evidence, regarding the practical usefulness of LLM-based tools such as ChatGPT for engineers in industry. We conduct an observational study of 24 professional software engineers who used ChatGPT in their jobs over the course of one week, and qualitatively analyse their dialogues with the chatbot as well as their overall experience (as captured by an exit survey). We find that rather than expecting ChatGPT to generate ready-to-use software artifacts (e.g., code), practitioners more often use ChatGPT to receive guidance on how to solve their tasks or learn about a topic in more abstract terms. We also propose a theoretical framework for how the (i) purpose of the interaction, (ii) internal factors (e.g., the user's personality), and (iii) external factors (e.g., company policy) together shape the experience (in terms of perceived usefulness and trust). We envision that our framework can be used by future research to further the academic discussion on LLM usage by software engineering practitioners, and to serve as a reference point for the design of future empirical LLM research in this domain.



Information

Published In
Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE (July 2024), 2770 pages.
EISSN: 2994-970X
DOI: 10.1145/3554322
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

Publisher
Association for Computing Machinery, New York, NY, United States

Publication History
Published: 12 July 2024
Published in PACMSE Volume 1, Issue FSE

Author Tags
1. Chatbots
2. Large Language Models (LLMs)
3. Software Development Bots

Qualifiers
• Research-article

Funding Sources
• WASP

