
Query expansion via learning change sequences

Published: 01 January 2020

Abstract

Proksch et al. showed that changed terms in source code negatively affect code search quality, yet current query expansion (QE) methods ignore this effect. In this paper, we propose a novel QE method based on the semantics of change sequences (QESC). It captures which changes occurred by extracting change sequences from GitHub commits, and it models why they occurred by learning the semantics of those sequences with a Deep Belief Network (DBN). From the changes that are semantically similar to a query, it can then extract relevant terms to add to the query and irrelevant terms to exclude. Our experimental results show that QESC outperforms existing QE methods by 15–23% in precision on the first query result.
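The expand-or-exclude idea in the abstract can be illustrated with a minimal Python sketch. This is not the QESC implementation: the paper learns sequence semantics with a DBN, whereas the sketch substitutes plain bag-of-words cosine similarity. The function names, the `(removed_terms, added_terms)` representation of a change, and the rule that added terms become expansion candidates while removed terms become exclusion candidates are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(query_terms, changes, top_k=1):
    """Return (expanded_query, excluded_terms).

    `changes` is a list of (removed_terms, added_terms) pairs, a
    hypothetical stand-in for change sequences mined from commits.
    Similarity here is bag-of-words cosine, NOT the DBN of the paper.
    """
    query_vec = Counter(query_terms)
    scored = [(cosine(query_vec, Counter(removed + added)), removed, added)
              for removed, added in changes]
    # Keep only changes that share at least one term with the query.
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)

    expand, exclude = [], []
    for _, removed, added in scored[:top_k]:
        # Assumption: terms a similar change added are worth adding to the query.
        expand += [t for t in added if t not in query_terms and t not in expand]
        # Assumption: terms a similar change removed are worth excluding.
        exclude += [t for t in removed if t not in query_terms and t not in exclude]
    return list(query_terms) + expand, exclude


changes = [
    (["HttpClient"], ["read", "file", "BufferedReader"]),
    (["parse"], ["json", "decode"]),
]
expanded, excluded = expand_query(["read", "file"], changes)
print(expanded, excluded)
```

In this toy input only the first change shares terms with the query, so its added term is appended to the query and its removed term is reported for exclusion.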

References

[1]
F. Lv, H.Y. Zhang, J.G. Lou, S.W. Wang, D.M. Zhang and J.J. Zhao, CodeHow: Effective code search based on API Understanding and extended boolean model (E), in: Proc 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), Lincoln, NE, USA, (2015), 260–270.
[2]
L. Nie, H. Jiang, Z. Ren, Z. Sun and X. Li, Query expansion based on crowd knowledge for code search, IEEE Transactions on Services Computing, (2016), 771–783.
[3]
S. Proksch, S. Amann, S. Nadi and M. Mezini, Evaluating the evaluations of code recommender systems: A reality check, in: Proc of 31st IEEE/ACM International Conference on Automated Software Engineering, Singapore, (2016), 111–121.
[4]
M. White, C. Vendome, M. Linares-Vásquez and D. Poshyvanyk, Toward deep learning software repositories, in: Proc the 12th Working Conference on Mining Software Repositories, Florence, Italy, (2015), 334–345.
[5]
X. Sun, X. Liu, J. Hu and J. Zhu, Empirical studies on the NLP techniques for source code data preprocessing, in: Proc 3rd International Workshop on Evidential Assessment of Software Technologies, Nanjing, China, (2014), 32–39.
[6]
C.D. Manning, P. Raghavan and H. Schütze, Introduction to information retrieval, Cambridge University Press, 2008.
[7]
B. Fluri, M. Würsch, M. Pinzger and H.C. Gall, Change distilling – tree differencing for fine-grained source code change extraction, IEEE Transactions on Software Engineering SE-33(11) (2007).
[8]
G.E. Hinton and R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313(5786) (2006), 504–507.
[9]
G.E. Hinton, S. Osindero and Y.W. Teh, A fast learning algorithm for deep belief nets, Neural Computation (2006), 1527–1554.
[10]
S. Haiduc, G. Bavota, A. Marcus, R. Oliveto, A.D. Lucia and T. Menzies, Automatic query reformulations for text retrieval in software engineering, in: Proc (2013) International Conference on Software Engineering, San Francisco, CA, USA (2013), 842–851.
[11]
I.H. Witten and E. Frank, Data mining: Practical machine learning tools and techniques, Morgan Kaufmann Series, (2005).
[12]
S. Proksch, S. Amann, S. Nadi and M. Mezini, Evaluating the evaluations of code recommender systems: A reality check, in: Proc of 31st IEEE/ACM International Conference on Automated Software Engineering, Singapore, (2016), 111–121.
[13]
D. Ciresan, U. Meier and J. Schmidhuber, Multi-column deep neural networks for image classification, in: Proc (2012) IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Washington, DC, USA, (2012), 3642–3649.
[14]
A. Krizhevsky, I. Sutskever and G.E. Hinton, Imagenet classification with deep convolutional neural networks, Communications of the ACM, (2017), 84–90.
[15]
A.-R. Mohamed, G.E. Dahl and G. Hinton, Acoustic modeling using deep belief networks, IEEE Transactions on Audio, Speech, and Language Processing, (2012), 14–22.
[16]
Q. Huang, Y. Yang, X. Zhan, H. Wan and G. Wu, Query expansion based on statistical learning from code changes, Softw Pract Exper (2018), 1–19.
[17]
V. Raychev, M.T. Vechev and E. Yahav, Code completion with statistical language models, in: Proc the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, Edinburgh, UK (2014), 419–428.
[18]
A. Krizhevsky, I. Sutskever and G.E. Hinton, Imagenet classification with deep convolutional neural networks, 25th International Conference on Neural Information Processing Systems (2012), 1097–1105.
[19]
R. Razavi-Far, E. Hallaji and M. Farajzadeh-Zanjani, Information fusion and semi-supervised deep learning scheme for diagnosing gear faults in induction machine systems, IEEE Transactions on Industrial Electronics (2018).
[20]
F.A. Gers and E. Schmidhuber, LSTM recurrent networks learn simple context-free and context-sensitive languages, IEEE Transactions on Neural Networks 12(6) (2001), 1333–1340.

Cited By

  • (2023) Big Code Search: A Bibliography, ACM Computing Surveys 56:1 (1-49), 10.1145/3604905. Online publication date: 26-Aug-2023

          Published In

          International Journal of Knowledge-based and Intelligent Engineering Systems, Volume 24, Issue 2
          2020
          93 pages

          Publisher

          IOS Press

          Netherlands

          Author Tags

          1. query expansion
          2. change sequence
          3. DBN

          Qualifiers

          • Research-article
