
Intelligent Code Completion with Bayesian Networks

Published: 02 December 2015

Abstract

Code completion is an integral part of modern Integrated Development Environments (IDEs). Developers often use it to explore Application Programming Interfaces (APIs). It is also useful to reduce the required amount of typing and to help avoid typos. Traditional code completion systems propose all type-correct methods to the developer. Such a list is often very long with many irrelevant items. More intelligent code completion systems have been proposed in prior work to reduce the list of proposed methods to relevant items.
This work extends one of these existing approaches, the Best Matching Neighbor (BMN) algorithm. We introduce Bayesian networks as an alternative underlying model, use additional context information for more precise recommendations, and apply clustering techniques to improve model sizes. We compare our new approach, Pattern-based Bayesian Networks (PBN), to the existing BMN algorithm. We extend previously used evaluation methodologies and, in addition to prediction quality, we also evaluate model size and inference speed.
Our results show that the additional context information we collect improves prediction quality, especially for queries that do not contain method calls. We also show that PBN can obtain comparable prediction quality to BMN, while model size and inference speed scale better with large input sizes.
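As a rough illustration of why context information can help with queries that contain no method calls, the following is a minimal sketch of a naive-Bayes-style network with a single "pattern" node whose children are the enclosing-method context and the individual calls. It is not the PBN implementation from the article: the two patterns, every probability, and the method names (`setText`, `setVisible`, `dispose`, `createDialog`) are hypothetical values chosen only for this example.

```python
# Toy sketch of pattern-based inference. All patterns, probabilities,
# and method names are made up for illustration (assumptions, not data
# from the article).

PATTERNS = {
    # pattern name: (prior, P(context | pattern), P(call | pattern))
    "create_and_show": (
        0.6,
        {"createDialog": 0.9, "dispose": 0.1},
        {"setText": 0.9, "setVisible": 0.8, "dispose": 0.1},
    ),
    "tear_down": (
        0.4,
        {"createDialog": 0.2, "dispose": 0.8},
        {"setText": 0.1, "setVisible": 0.2, "dispose": 0.9},
    ),
}


def pattern_posterior(context, observed_calls):
    """Return P(pattern | context, observed calls) for every pattern.

    Only calls that were actually observed count as evidence; calls that
    are missing from the query are treated as unknown, not as absent.
    """
    scores = {}
    for name, (prior, p_ctx, p_call) in PATTERNS.items():
        score = prior * p_ctx.get(context, 0.0)
        for call in observed_calls:
            score *= p_call.get(call, 0.0)
        scores[name] = score
    total = sum(scores.values())
    if total == 0.0:
        # Evidence never seen with any pattern: fall back to the priors.
        return {name: prior for name, (prior, _, _) in PATTERNS.items()}
    return {name: score / total for name, score in scores.items()}


def recommend(context, observed_calls):
    """Rank the not-yet-observed calls by their marginal probability."""
    posterior = pattern_posterior(context, observed_calls)
    candidates = set()
    for _, _, p_call in PATTERNS.values():
        candidates.update(p_call)
    candidates -= set(observed_calls)
    ranked = []
    for call in candidates:
        prob = sum(
            posterior[name] * PATTERNS[name][2].get(call, 0.0)
            for name in PATTERNS
        )
        ranked.append((call, prob))
    # Sort by probability (descending), then by name for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    # Same empty set of observed calls, different enclosing-method context.
    print(recommend("createDialog", set()))
    print(recommend("dispose", set()))
```

In this toy model the two queries contain no method calls at all, yet the top proposal differs because the enclosing-method context shifts the posterior over patterns. A real system would learn the patterns and probabilities from mined usage data rather than hard-coding them.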

Published In

ACM Transactions on Software Engineering and Methodology, Volume 25, Issue 1
December 2015
339 pages
ISSN:1049-331X
EISSN:1557-7392
DOI:10.1145/2852270
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 December 2015
Accepted: 01 March 2015
Revised: 01 January 2015
Received: 01 February 2014
Published in TOSEM Volume 25, Issue 1

Author Tags

  1. Content assist
  2. code completion
  3. code recommender
  4. evaluation
  5. integrated development environments
  6. machine learning
  7. productivity

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • German Federal Ministry of Education and Research (BMBF)

Article Metrics

  • Downloads (last 12 months): 42
  • Downloads (last 6 weeks): 5

Reflects downloads up to 13 Jan 2025

Cited By

  • (2024) A Combinatorial Strategy for API Completion: Deep Learning and Heuristics. Electronics 13:18 (3669). DOI: 10.3390/electronics13183669. Online publication date: 15-Sep-2024
  • (2024) Multi-line AI-Assisted Code Authoring. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering (150-160). DOI: 10.1145/3663529.3663836. Online publication date: 10-Jul-2024
  • (2024) Significant Productivity Gains through Programming with Large Language Models. Proceedings of the ACM on Human-Computer Interaction 8:EICS (1-29). DOI: 10.1145/3661145. Online publication date: 17-Jun-2024
  • (2024) DeciX: Explain Deep Learning Based Code Generation Applications. Proceedings of the ACM on Software Engineering 1:FSE (2424-2446). DOI: 10.1145/3660814. Online publication date: 12-Jul-2024
  • (2024) AI-Assisted Code Authoring at Scale: Fine-Tuning, Deploying, and Mixed Methods Evaluation. Proceedings of the ACM on Software Engineering 1:FSE (1066-1085). DOI: 10.1145/3643774. Online publication date: 12-Jul-2024
  • (2024) Learning-based Relaxation of Completeness Requirements for Data Entry Forms. ACM Transactions on Software Engineering and Methodology 33:3 (1-32). DOI: 10.1145/3635708. Online publication date: 15-Mar-2024
  • (2024) PTM-APIRec: Leveraging Pre-trained Models of Source Code in API Recommendation. ACM Transactions on Software Engineering and Methodology 33:3 (1-30). DOI: 10.1145/3632745. Online publication date: 15-Mar-2024
  • (2024) Code Recommendation for Schema Evolution of Mimic Storage Systems. International Journal of Software Engineering and Knowledge Engineering 35:01 (89-110). DOI: 10.1142/S0218194024500499. Online publication date: 28-Oct-2024
  • (2024) Deep Learning-Based Code Completion: On the Impact on Performance of Contextual Information. 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (211-223). DOI: 10.1109/ICSME58944.2024.00029. Online publication date: 6-Oct-2024
  • (2024) A survey on machine learning techniques applied to source code. Journal of Systems and Software 209:C. DOI: 10.1016/j.jss.2023.111934. Online publication date: 14-Mar-2024
