Intelligent Code Completion with Bayesian Networks

Authors:

Sebastian Proksch, Johannes Lerch, Mira Mezini

ACM Transactions on Software Engineering and Methodology (TOSEM), Volume 25, Issue 1

Article No.: 3, Pages 1 - 31

https://doi.org/10.1145/2744200

Published: 02 December 2015

Abstract

Code completion is an integral part of modern Integrated Development Environments (IDEs). Developers often use it to explore Application Programming Interfaces (APIs). It is also useful to reduce the required amount of typing and to help avoid typos. Traditional code completion systems propose all type-correct methods to the developer. Such a list is often very long with many irrelevant items. More intelligent code completion systems have been proposed in prior work to reduce the list of proposed methods to relevant items.

This work extends one of these existing approaches, the Best Matching Neighbor (BMN) algorithm. We introduce Bayesian networks as an alternative underlying model, use additional context information for more precise recommendations, and apply clustering techniques to reduce model sizes. We compare our new approach, Pattern-based Bayesian Networks (PBN), to the existing BMN algorithm. We extend previously used evaluation methodologies and, in addition to prediction quality, we also evaluate model size and inference speed.

Our results show that the additional context information we collect improves prediction quality, especially for queries that do not contain method calls. We also show that PBN can obtain comparable prediction quality to BMN, while model size and inference speed scale better with large input sizes.
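
To make the idea of a pattern-based probabilistic model concrete, the following is a minimal sketch and not the PBN implementation from the article. It assumes a naive-Bayes-shaped network with one hidden "pattern" node whose children are the enclosing-method context and one node per callable method. The function names `train` and `recommend`, the smoothing constant `alpha`, the toy usage data, and the merging of identical usages (a stand-in for real clustering) are all illustrative assumptions.

```python
from collections import Counter

ALPHA = 0.01  # assumed smoothing constant so that no probability is exactly 0 or 1


def train(usages):
    """Merge identical usages into patterns (a stand-in for real clustering).

    Each usage is a (context, frozenset_of_called_methods) pair.
    Returns the list of patterns and the sorted list of all known methods.
    """
    counts = Counter(usages)
    total = sum(counts.values())
    methods = sorted({m for _, calls in counts for m in calls})
    patterns = [
        {"prior": n / total, "context": ctx, "calls": calls}
        for (ctx, calls), n in counts.items()
    ]
    return patterns, methods


def p_call(pattern, method):
    """Smoothed P(method is called | pattern)."""
    return 1 - ALPHA if method in pattern["calls"] else ALPHA


def recommend(patterns, methods, context, called):
    """Rank methods not yet called by P(method | context, called)."""
    # Unnormalized posterior over patterns given the evidence in the query.
    weights = []
    for p in patterns:
        w = p["prior"] * (1 - ALPHA if p["context"] == context else ALPHA)
        for m in called:
            w *= p_call(p, m)
        weights.append(w)
    z = sum(weights)
    # Marginalize over patterns to score each candidate method.
    scores = {
        m: sum(w / z * p_call(p, m) for w, p in zip(weights, patterns))
        for m in methods
        if m not in called
    }
    return sorted(scores.items(), key=lambda kv: -kv[1])


# Hypothetical usages of a text widget, observed in two enclosing methods.
usages = (
    [("createDialog", frozenset({"<init>", "setText", "setFont"}))] * 3
    + [("createDialog", frozenset({"<init>", "setText"}))]
    + [("handleEvent", frozenset({"getText"}))] * 4
)
patterns, methods = train(usages)

# A query without any method calls still gets a ranking from context alone.
print(recommend(patterns, methods, "handleEvent", set()))
# A query with one observed call in a different context.
print(recommend(patterns, methods, "createDialog", {"<init>"}))
```

In this sketch, methods that have not been called yet are treated as unobserved rather than as negative evidence, since a developer in the middle of writing code may still add them. The first query illustrates the point made above about queries without method calls: the context node alone shifts the posterior toward the matching pattern.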

Information

Published In

ACM Transactions on Software Engineering and Methodology Volume 25, Issue 1

December 2015

339 pages

ISSN:1049-331X

EISSN:1557-7392

DOI:10.1145/2852270

Editor:
David S. Rosenblum
National University of Singapore, Singapore

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 December 2015

Accepted: 01 March 2015

Revised: 01 January 2015

Received: 01 February 2014

Published in TOSEM Volume 25, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

German Federal Ministry of Education and Research (BMBF)
