
Graph-based, self-supervised program repair from diagnostic feedback

Published: 13 July 2020

Abstract

We consider the problem of learning to repair programs from diagnostic feedback (e.g., compiler error messages). Program repair is challenging for two reasons: First, it requires reasoning and tracking symbols across source code and diagnostic feedback. Second, labeled datasets available for program repair are relatively small. In this work, we propose novel solutions to these two challenges. First, we introduce a program-feedback graph, which connects symbols relevant to program repair in source code and diagnostic feedback, and then apply a graph neural network on top to model the reasoning process. Second, we present a self-supervised learning paradigm for program repair that leverages unlabeled programs available online to create a large amount of extra program repair examples, which we use to pre-train our models. We evaluate our proposed approach on two applications: correcting introductory programming assignments (DeepFix dataset) and correcting the outputs of program synthesis (SPoC dataset). Our final system, DrRepair, significantly outperforms prior work, achieving 68.2% full repair rate on DeepFix (+22.9% over the prior best), and 48.4% synthesis success rate on SPoC (+3.7% over the prior best).
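
To make the two ideas above concrete, the following is a minimal Python sketch (not the authors' implementation): it corrupts a compilable C program to manufacture a (broken program, diagnostic feedback, fixed program) training triple of the kind used for pre-training, and it links occurrences of the same identifier across source code and compiler message, the kind of connection a program-feedback graph encodes. The helper names (run_compiler, corrupt, program_feedback_edges) and the specific perturbation rules are illustrative assumptions, not the paper's exact procedure.

import os
import random
import re
import subprocess
import tempfile

def run_compiler(source: str) -> str:
    """Compile a C program with gcc and return the first diagnostic line ('' if it compiles)."""
    with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        proc = subprocess.run(["gcc", "-fsyntax-only", path],
                              capture_output=True, text=True)
        return proc.stderr.splitlines()[0] if proc.stderr else ""
    finally:
        os.remove(path)

def corrupt(source: str) -> str:
    """Apply one random perturbation: drop a semicolon or mangle an identifier."""
    lines = source.splitlines()
    if not lines:
        return source
    i = random.randrange(len(lines))
    if ";" in lines[i] and random.random() < 0.5:
        lines[i] = lines[i].replace(";", "", 1)               # delete a semicolon
    else:
        ids = re.findall(r"[A-Za-z_]\w+", lines[i])
        if ids:
            name = random.choice(ids)
            lines[i] = lines[i].replace(name, name[1:], 1)    # drop its first character
    return "\n".join(lines)

def make_repair_example(source: str):
    """Return a (broken program, feedback, fixed program) triple, or None if it still compiles."""
    broken = corrupt(source)
    feedback = run_compiler(broken)
    return (broken, feedback, source) if feedback else None

def program_feedback_edges(code_tokens, feedback_tokens):
    """Connect occurrences of the same identifier across code and diagnostic feedback."""
    edges = []
    for i, tok in enumerate(code_tokens):
        for j, fb in enumerate(feedback_tokens):
            if tok == fb and tok.isidentifier():
                edges.append((("code", i), ("feedback", j)))
    return edges

Applied to a large pool of unlabeled programs, make_repair_example yields extra repair examples without human labeling, which is the pre-training signal described above; program_feedback_edges illustrates only the simplest kind of symbol link a program-feedback graph would contain.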

References

[1]
Ahmed, U. Z., Kumar, P., Karkare, A., Kar, P., and Gulwani, S. Compilation error repair: for the student programs, from the student programs. In ICSE, 2018.
[2]
Allamanis, M., Brockschmidt, M., and Khademi, M. Learning to represent programs with graphs. In ICLR, 2018.
[3]
Bader, J., Scott, A., Pradel, M., and Chandra, S. Getafix: Learning to fix bugs automatically. In OOPSLA, 2019.
[4]
Brockschmidt, M., Allamanis, M., and Gaunt, A. Generative code modeling with graphs. In ICLR, 2019.
[5]
Chen, Z., Kommrusch, S. J., Tufano, M., Pouchet, L.-N., Poshyvanyk, D., and Monperrus, M. SequenceR: Sequence-to-sequence learning for end-to-end program repair. IEEE Transactions on Software Engineering, 2019.
[6]
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL, 2019.
[7]
Dinella, E., Dai, H., Li, Z., Naik, M., Song, L., and Wang, K. Hoppity: Learning graph transformations to detect and fix bugs in programs. In ICLR, 2020.
[8]
Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., and Bengio, S. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 2010.
[9]
Feng, Z., Guo, D., Tang, D., Duan, N., Feng, X., Gong, M., Shou, L., Qin, B., Liu, T., Jiang, D., and Zhou, M. CodeBERT: A pre-trained model for programming and natural languages. arXiv:2002.08155, 2020.
[10]
Fitzgerald, S., Lewandowski, G., McCauley, R., Murphy, L., Simon, B., Thomas, L., and Zander, C. Debugging: finding, fixing and flailing, a multi-institutional study of novice debuggers. Computer Science Education, 2008.
[11]
Gupta, R., Pal, S., Kanade, A., and Shevade, S. DeepFix: Fixing common C language errors by deep learning. In AAAI, 2017.
[12]
Gupta, R., Kanade, A., and Shevade, S. Neural attribution for semantic bug-localization in student programs. In NeurIPS, 2019a.
[13]
Gupta, R., Kanade, A., and Shevade, S. Deep reinforcement learning for programming language correction. In AAAI, 2019b.
[14]
Hajipour, H., Bhattacharya, A., and Fritz, M. SampleFix: Learning to correct programs by sampling diverse fixes. arXiv:1906.10502, 2019.
[15]
Hochreiter, S. and Schmidhuber, J. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.
[16]
Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V., and Leskovec, J. Pre-training graph neural networks. In ICLR, 2020.
[17]
Just, R., Jalali, D., and Ernst, M. D. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In ISSTA, 2014.
[18]
Kingma, D. and Ba, J. Adam: A method for stochastic optimization. In ICLR, 2015.
[19]
Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. In ICLR, 2017.
[20]
Kulal, S., Pasupat, P., Chandra, K., Lee, M., Padon, O., Aiken, A., and Liang, P. SPoC: Search-based pseudocode to code. In NeurIPS, 2019.
[21]
Liu, B., Tür, G., Hakkani-Tür, D., Shah, P., and Heck, L. Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems. In NAACL, 2018.
[22]
Mesbah, A., Rice, A., Johnston, E., Glorioso, N., and Aftandilian, E. Learning to repair compilation errors. In ESEC/FSE, 2019.
[23]
Monperrus, M. The living review on automated program repair. Technical Report hal-01956501, HAL/archives-ouvertes.fr, 2018.
[24]
Parihar, S., Dadachanji, Z., Singh, P. K., Das, R., Karkare, A., and Bhattacharya, A. Automatic grading and feedback using program repair for introductory programming courses. In ITiCSE, 2017.
[25]
Pascanu, R., Mikolov, T., and Bengio, Y. On the difficulty of training recurrent neural networks. arXiv:1211.5063, 2012.
[26]
Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. Deep contextualized word representations. In NAACL, 2018.
[27]
Pradel, M. and Sen, K. DeepBugs: A learning approach to name-based bug detection. In OOPSLA, 2018.
[28]
Pu, Y., Narasimhan, K., Solar-Lezama, A., and Barzilay, R. sk_p: A neural program corrector for MOOCs. In SPLASH Companion, 2016.
[29]
See, A., Liu, P. J., and Manning, C. D. Get to the point: Summarization with pointer-generator networks. In ACL, 2017.
[30]
Seo, H., Sadowski, C., Elbaum, S., Aftandilian, E., and Bowdidge, R. Programmers' build errors: A case study at Google. In ICSE, 2014.
[31]
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. JMLR, 2014.
[32]
Tarlow, D., Moitra, S., Rice, A., Chen, Z., Manzagol, P.-A., Sutton, C., and Aftandilian, E. Learning to fix build errors with graph2diff neural networks. arXiv:1911.01205, 2019.
[33]
Vasic, M., Kanade, A., Maniatis, P., Bieber, D., and Singh, R. Neural program repair by jointly learning to localize and repair. In ICLR, 2019.
[34]
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. In NeurIPS, 2017.
[35]
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., and Bengio, Y. Graph attention networks. In ICLR, 2018.
[36]
Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P.-A. Extracting and composing robust features with denoising autoencoders. In ICML, 2008.
[37]
Wang, K., Singh, R., and Su, Z. Dynamic neural program embeddings for program repair. In ICLR, 2018.
[38]
Xu, K., Hu, W., Leskovec, J., and Jegelka, S. How powerful are graph neural networks? In ICLR, 2019.
[39]
Yasunaga, M., Zhang, R., Meelu, K., Pareek, A., Srinivasan, K., and Radev, D. R. Graph-based neural multi-document summarization. In CoNLL, 2017.
[40]
Zhang, Y., Qi, P., and Manning, C. D. Graph convolution over pruned dependency trees improves relation extraction. In EMNLP, 2018.
[41]
Zhao, R., Bieber, D., Swersky, K., and Tarlow, D. Neural networks for modeling source code edits. arXiv:1904.02818, 2019.
[42]
Zhong, R., Stern, M., and Klein, D. Semantic scaffolds for pseudocode-to-code generation. arXiv:2005.05927, 2020.

Information

Published In

ICML'20: Proceedings of the 37th International Conference on Machine Learning
July 2020
11702 pages

Publisher

JMLR.org

Publication History

Published: 13 July 2020

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Contributors

Michihiro Yasunaga, Percy Liang

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 68
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 29 Jan 2025

Cited By

  • (2024) Rust-lancet: Automated Ownership-Rule-Violation Fixing with Behavior Preservation. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pp. 1-13. https://doi.org/10.1145/3597503.3639103. Online publication date: 20-May-2024.
  • (2023) Contextual Predictive Mutation Testing. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 250-261. https://doi.org/10.1145/3611643.3616289. Online publication date: 30-Nov-2023.
  • (2023) LARCH: Large Language Model-based Automatic Readme Creation with Heuristics. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pp. 5066-5070. https://doi.org/10.1145/3583780.3614744. Online publication date: 21-Oct-2023.
  • (2023) Advances in Automated Pedagogical Compile-time Error Repair. Proceedings of the 16th Innovations in Software Engineering Conference, pp. 1-11. https://doi.org/10.1145/3578527.3578535. Online publication date: 23-Feb-2023.
  • (2023) PRIORITY: An Intelligent Problem Indicator Repository. Proceedings of the 16th Innovations in Software Engineering Conference, pp. 1-10. https://doi.org/10.1145/3578527.3578533. Online publication date: 23-Feb-2023.
