Abstract
Developers often need knowledge beyond what they possess, which leads them to ask co-workers for help or to consult additional sources of information, such as Application Programming Interface (API) documentation, forums, and Q&A websites. However, formulating the problem and then perusing and processing the results takes time and energy. We propose a novel approach that, given a context in the Integrated Development Environment (IDE), automatically retrieves pertinent discussions from Stack Overflow, evaluates their relevance using a multi-faceted ranking model, and, if a given confidence threshold is surpassed, notifies the developer. We have implemented our approach in Prompter, an Eclipse plug-in. Prompter was evaluated in two empirical studies. The first study involved 33 participants and aimed at evaluating Prompter’s ranking model. The second study involved 12 participants and aimed at evaluating Prompter’s usefulness in supporting developers during development and maintenance tasks. Since Prompter uses “volatile information” crawled from the web, we also replicated Study I after one year to assess the impact of such “volatility” on recommenders like Prompter. Our results indicate that (i) Prompter’s recommendations were positively evaluated in 74 % of the cases on average, (ii) Prompter significantly helps developers improve the correctness of their tasks, by 24 % on average, but also (iii) 78 % of the provided recommendations are “volatile” and can change over the course of one year. While Prompter proved to be effective, our studies also point out issues that arise when building recommenders based on information available on online forums.
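The retrieve, rank, and notify flow described above can be illustrated with a minimal sketch. It is not Prompter's actual ranking model: the feature names, the weights, and the 0.6 threshold are hypothetical values chosen only for illustration, and the sketch assumes that every feature score is already normalized to the range [0, 1].

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Discussion:
    """A candidate Stack Overflow discussion with per-feature relevance scores."""
    title: str
    features: Dict[str, float]  # assumed normalized to [0, 1]


# Hypothetical features and weights, for illustration only.
WEIGHTS = {
    "textual_similarity": 0.5,
    "api_overlap": 0.3,
    "answer_score": 0.2,
}


def rank_score(discussion: Discussion, weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine the feature scores into one value; missing features count as 0."""
    return sum(w * discussion.features.get(name, 0.0) for name, w in weights.items())


def recommend(discussions: Sequence[Discussion],
              threshold: float = 0.6) -> Optional[Discussion]:
    """Return the top-ranked discussion only if its score exceeds the threshold."""
    if not discussions:
        return None
    best = max(discussions, key=rank_score)
    return best if rank_score(best) > threshold else None
```

Returning `None` below the threshold mirrors the idea that the developer is notified only when the confidence is high enough, rather than on every retrieval.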
Notes
The s threshold is customizable. By default it is set to 5.
Classes are identified by the unique id projectName.packageName.ClassName; methods are identified by projectName.packageName.ClassName.methodSignature.
During the break, participants did not have the chance to exchange information among themselves.
Acknowledgements
Luca Ponzanelli and Michele Lanza thank the Swiss National Science Foundation for the financial support through SNF Project “ESSENTIALS”, No. 153129.
Additional information
Communicated by: Sung Kim and Martin Pinzger
About this article
Cite this article
Ponzanelli, L., Bavota, G., Di Penta, M. et al. Prompter. Empir Software Eng 21, 2190–2231 (2016). https://doi.org/10.1007/s10664-015-9397-1
DOI: https://doi.org/10.1007/s10664-015-9397-1