
Graph-based, self-supervised program repair from diagnostic feedback

Published: 13 July 2020

Abstract

We consider the problem of learning to repair programs from diagnostic feedback (e.g., compiler error messages). Program repair is challenging for two reasons: First, it requires reasoning and tracking symbols across source code and diagnostic feedback. Second, labeled datasets available for program repair are relatively small. In this work, we propose novel solutions to these two challenges. First, we introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback, and then apply a graph neural network on top to model the reasoning process. Second, we present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online to create a large amount of extra program repair examples, which we use to pre-train our models. We evaluate our proposed approach on two applications: correcting introductory programming assignments (DeepFix dataset) and correcting the outputs of program synthesis (SPoC dataset). Our final system, DrRepair, significantly outperforms prior work, achieving 68.2% full repair rate on DeepFix (+22.9% over the prior best), and 48.4% synthesis success rate on SPoC (+3.7% over the prior best).
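
To make the two ideas above concrete, the following is a minimal Python sketch (not the authors' implementation): it corrupts a compilable C program to manufacture a (broken program, diagnostic feedback, fixed program) training triple of the kind used for pre-training, and it links occurrences of the same identifier across source code and compiler message, the kind of connection a program-feedback graph encodes. The helper names (run_compiler, corrupt, program_feedback_edges) and the specific perturbation rules are illustrative assumptions, not the paper's exact procedure.

import os
import random
import re
import subprocess
import tempfile

def run_compiler(source: str) -> str:
    """Compile a C program with gcc and return the first diagnostic line ('' if it compiles)."""
    with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        proc = subprocess.run(["gcc", "-fsyntax-only", path],
                              capture_output=True, text=True)
        return proc.stderr.splitlines()[0] if proc.stderr else ""
    finally:
        os.remove(path)

def corrupt(source: str) -> str:
    """Apply one random perturbation: drop a semicolon or mangle an identifier."""
    lines = source.splitlines()
    if not lines:
        return source
    i = random.randrange(len(lines))
    if ";" in lines[i] and random.random() < 0.5:
        lines[i] = lines[i].replace(";", "", 1)               # delete a semicolon
    else:
        ids = re.findall(r"[A-Za-z_]\w+", lines[i])
        if ids:
            name = random.choice(ids)
            lines[i] = lines[i].replace(name, name[1:], 1)    # drop its first character
    return "\n".join(lines)

def make_repair_example(source: str):
    """Return a (broken program, feedback, fixed program) triple, or None if it still compiles."""
    broken = corrupt(source)
    feedback = run_compiler(broken)
    return (broken, feedback, source) if feedback else None

def program_feedback_edges(code_tokens, feedback_tokens):
    """Connect occurrences of the same identifier across code and diagnostic feedback."""
    edges = []
    for i, tok in enumerate(code_tokens):
        for j, fb in enumerate(feedback_tokens):
            if tok == fb and tok.isidentifier():
                edges.append((("code", i), ("feedback", j)))
    return edges

Applied to a large pool of unlabeled programs, make_repair_example yields extra repair examples without human labeling, which is the pre-training signal described above; program_feedback_edges illustrates only the simplest kind of symbol link a program-feedback graph would contain.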

References

[1]
Ahmed, U. Z., Kumar, P., Karkare, A., Kar, P., and Gulwani, S. Compilation error repair: for the student programs, from the student programs. In ICSE, 2018.
[2]
Allamanis, M., Brockschmidt, M., and Khademi, M. Learning to represent programs with graphs. In ICLR, 2018.
[3]
Bader, J., Scott, A., Pradel, M., and Chandra, S. Getafix: Learning to fix bugs automatically. In OOPSLA, 2019.
[4]
Brockschmidt, M., Allamanis, M., and Gaunt, A. Generative code modeling with graphs. In ICLR, 2019.
[5]
Chen, Z., Kommrusch, S. J., Tufano, M., Pouchet, L.-N., Poshyvanyk, D., and Monperrus, M. SequenceR: Sequence-to-sequence learning for end-to-end program repair. IEEE Transactions on Software Engineering, 2019.
[6]
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL, 2019.
[7]
Dinella, E., Dai, H., Li, Z., Naik, M., Song, L., and Wang, K. Hoppity: Learning graph transformations to detect and fix bugs in programs. In ICLR, 2020.
[8]
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., and Bengio, S. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 2010.
[9]
Feng, Z., Guo, D., Tang, D., Duan, N., Feng, X., Gong, M., Shou, L., Qin, B., Liu, T., Jiang, D., and Zhou, M. CodeBERT: A pre-trained model for programming and natural languages. arXiv:2002.08155, 2020.
[10]
Fitzgerald, S., Lewandowski, G., McCauley, R., Murphy, L., Simon, B., Thomas, L., and Zander, C. Debugging: finding, fixing and flailing, a multi-institutional study of novice debuggers. Computer Science Education, 2008.
[11]
Gupta, R., Pal, S., Kanade, A., and Shevade, S. DeepFix: Fixing common C language errors by deep learning. In AAAI, 2017.
[12]
Gupta, R., Kanade, A., and Shevade, S. Neural attribution for semantic bug-localization in student programs. In NeurIPS, 2019a.
[13]
Gupta, R., Kanade, A., and Shevade, S. Deep reinforcement learning for programming language correction. In AAAI, 2019b.
[14]
Hajipour, H., Bhattacharya, A., and Fritz, M. SampleFix: Learning to correct programs by sampling diverse fixes. arXiv:1906.10502, 2019.
[15]
Hochreiter, S. and Schmidhuber, J. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.
[16]
Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V., and Leskovec, J. Pre-training graph neural networks. In ICLR, 2020.
[17]
Just, R., Jalali, D., and Ernst, M. D. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In ISSTA, 2014.
[18]
Kingma, D. and Ba, J. Adam: A method for stochastic optimization. In ICLR, 2015.
[19]
Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. In ICLR, 2017.
[20]
Kulal, S., Pasupat, P., Chandra, K., Lee, M., Padon, O., Aiken, A., and Liang, P. SPoC: Search-based pseudocode to code. In NeurIPS, 2019.
[21]
Liu, B., Tür, G., Hakkani-Tür, D., Shah, P., and Heck, L. Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems. In NAACL, 2018.
[22]
Mesbah, A., Rice, A., Johnston, E., Glorioso, N., and Aftandilian, E. Learning to repair compilation errors. In ESEC/FSE, 2019.
[23]
Monperrus, M. The living review on automated program repair. Technical Report hal-01956501, HAL/archives-ouvertes.fr, 2018.
[24]
Parihar, S., Dadachanji, Z., Singh, P. K., Das, R., Karkare, A., and Bhattacharya, A. Automatic grading and feedback using program repair for introductory programming courses. In ITiCSE, 2017.
[25]
Pascanu, R., Mikolov, T., and Bengio, Y. On the difficulty of training recurrent neural networks. arXiv:1211.5063, 2012.
[26]
Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. Deep contextualized word representations. In NAACL, 2018.
[27]
Pradel, M. and Sen, K. DeepBugs: A learning approach to name-based bug detection. In OOPSLA, 2018.
[28]
Pu, Y., Narasimhan, K., Solar-Lezama, A., and Barzilay, R. sk_p: A neural program corrector for MOOCs. In SPLASH Companion, 2016.
[29]
See, A., Liu, P. J., and Manning, C. D. Get to the point: Summarization with pointer-generator networks. In ACL, 2017.
[30]
Seo, H., Sadowski, C., Elbaum, S., Aftandilian, E., and Bowdidge, R. Programmers' build errors: A case study at Google. In ICSE, 2014.
[31]
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. JMLR, 2014.
[32]
Tarlow, D., Moitra, S., Rice, A., Chen, Z., Manzagol, P.-A., Sutton, C., and Aftandilian, E. Learning to fix build errors with graph2diff neural networks. arXiv:1911.01205, 2019.
[33]
Vasic, M., Kanade, A., Maniatis, P., Bieber, D., and Singh, R. Neural program repair by jointly learning to localize and repair. In ICLR, 2019.
[34]
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. In NeurIPS, 2017.
[35]
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., and Bengio, Y. Graph attention networks. In ICLR, 2018.
[36]
Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P.-A. Extracting and composing robust features with denoising autoencoders. In ICML, 2008.
[37]
Wang, K., Singh, R., and Su, Z. Dynamic neural program embeddings for program repair. In ICLR, 2018.
[38]
Xu, K., Hu, W., Leskovec, J., and Jegelka, S. How powerful are graph neural networks? In ICLR, 2019.
[39]
Yasunaga, M., Zhang, R., Meelu, K., Pareek, A., Srinivasan, K., and Radev, D. R. Graph-based neural multi-document summarization. In CoNLL, 2017.
[40]
Zhang, Y., Qi, P., and Manning, C. D. Graph convolution over pruned dependency trees improves relation extraction. In EMNLP, 2018.
[41]
Zhao, R., Bieber, D., Swersky, K., and Tarlow, D. Neural networks for modeling source code edits. arXiv:1904.02818, 2019.
[42]
Zhong, R., Stern, M., and Klein, D. Semantic scaffolds for pseudocode-to-code generation. arXiv:2005.05927, 2020.

Information

Published In

ICML'20: Proceedings of the 37th International Conference on Machine Learning
July 2020
11702 pages

Publisher

JMLR.org

Publication History

Published: 13 July 2020

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Contributors

Michihiro Yasunaga, Percy Liang

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 68
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 29 Jan 2025

Cited By

  • (2024) Rust-lancet: Automated Ownership-Rule-Violation Fixing with Behavior Preservation. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pp. 1-13. https://doi.org/10.1145/3597503.3639103. Online publication date: 20-May-2024.
  • (2023) Contextual Predictive Mutation Testing. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 250-261. https://doi.org/10.1145/3611643.3616289. Online publication date: 30-Nov-2023.
  • (2023) LARCH: Large Language Model-based Automatic Readme Creation with Heuristics. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pp. 5066-5070. https://doi.org/10.1145/3583780.3614744. Online publication date: 21-Oct-2023.
  • (2023) Advances in Automated Pedagogical Compile-time Error Repair. Proceedings of the 16th Innovations in Software Engineering Conference, pp. 1-11. https://doi.org/10.1145/3578527.3578535. Online publication date: 23-Feb-2023.
  • (2023) PRIORITY: An Intelligent Problem Indicator Repository. Proceedings of the 16th Innovations in Software Engineering Conference, pp. 1-10. https://doi.org/10.1145/3578527.3578533. Online publication date: 23-Feb-2023.
