Abstract
Pull-based development supports collaborative distributed development. It enables developers to collaborate on projects hosted on GitHub. A developer who wants to contribute to a project forks the repository, modifies the fork, and sends a pull request asking the development team to merge the code changes into the official repository. When the development team receives a pull request, its members review the changes and decide whether to accept them. However, efficiently finding suitable pull request reviewers is a challenge. In this paper, we propose a multi-instance-based deep neural network model to recommend reviewers for pull requests. Given a pull request, our model extracts three features: the pull request title, the commit message, and the code change. The model extracts these features automatically from the code changes of every commit in the pull request. The features of the different commits are then merged to predict the likelihood that a reviewer candidate is the appropriate reviewer. We use a CNN and an LSTM network to learn the features because the pull request title and commit message features have a different structure from the code change, which is written in a programming language. To test the effectiveness of our model, we performed a set of experiments using 43,986 pull requests extracted from 12 open-source projects. We compare our model with two baseline approaches, CoreDevRec and Majority Classes. The experiments demonstrate that our model outperforms these two state-of-the-art baselines. For instance, for the TensorFlow project, our model’s accuracy in determining the appropriate reviewers is 50.80%, 74.70%, and 84.04% in Top-1, Top-3, and Top-5 recommendation, respectively.
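The Top-1, Top-3, and Top-5 figures above can be read as top-k accuracy over ranked reviewer candidates. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes that a recommendation counts as correct when at least one actual reviewer appears among the k highest-scored candidates; the abstract does not spell out the exact definition. The function name `top_k_accuracy`, the reviewer names, and the scores are all hypothetical.

```python
from typing import Dict, List, Set


def top_k_accuracy(
    scores: List[Dict[str, float]],
    actual: List[Set[str]],
    k: int,
) -> float:
    """Fraction of pull requests whose k best-scored candidates
    include at least one actual reviewer (assumed hit criterion)."""
    if len(scores) != len(actual):
        raise ValueError("scores and actual must have the same length")
    if not scores:
        return 0.0
    hits = 0
    for candidate_scores, reviewers in zip(scores, actual):
        # Rank candidates from highest to lowest predicted likelihood.
        ranked = sorted(candidate_scores, key=candidate_scores.get, reverse=True)
        if reviewers.intersection(ranked[:k]):
            hits += 1
    return hits / len(scores)


# Hypothetical predicted likelihoods for three pull requests.
example_scores = [
    {"alice": 0.70, "bob": 0.20, "carol": 0.10},
    {"alice": 0.30, "bob": 0.50, "carol": 0.20},
    {"alice": 0.40, "bob": 0.35, "carol": 0.25},
]
# Hypothetical ground truth: who actually reviewed each pull request.
example_actual = [{"alice"}, {"carol"}, {"bob"}]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"Top-{k} accuracy: {top_k_accuracy(example_scores, example_actual, k):.2f}")
```

Under this definition the accuracy can only stay the same or grow as k increases, which is consistent with the rising Top-1, Top-3, and Top-5 values reported for TensorFlow.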
Ethics declarations
Conflict of interest
Xin Ye declares that he has no conflict of interest. Yongjie Zheng declares that he has no conflict of interest. Wajdi Mohammed Aljedaani declares that he has no conflict of interest. Mohamed Wiem Mkaouer declares that he has no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Ye, X., Zheng, Y., Aljedaani, W. et al. Recommending pull request reviewers based on code changes. Soft Comput 25, 5619–5632 (2021). https://doi.org/10.1007/s00500-020-05559-3
DOI: https://doi.org/10.1007/s00500-020-05559-3