DOI: 10.1145/3196321.3196326
research-article

Unsupervised deep bug report summarization

Published: 28 May 2018

Abstract

Bug report summarization is an effective way to reduce the considerable time spent wading through numerous bug reports. Although several supervised and unsupervised algorithms have been proposed for this task, their performance remains limited by the particular characteristics of bug reports: the evaluation behaviours they contain, the mixture of sentences in software language and natural language, and the domain-specific predefined fields. In this study, we conduct the first exploration of deep learning networks for bug report summarization. Our approach, called DeepSum, is a novel stepped auto-encoder network with evaluation enhancement and predefined fields enhancement modules, which integrates these bug report characteristics into a deep neural network. Because DeepSum is unsupervised, it significantly reduces the effort of labeling huge training sets. Extensive experiments on public datasets show that DeepSum outperforms the comparative algorithms by up to 13.2% and 9.2% in terms of F-score and Rouge-n metrics respectively, and achieves state-of-the-art performance. Our work shows promising prospects for deep learning to summarize millions of bug reports.
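
The F-score and Rouge-n figures quoted in the abstract are standard measures for extractive summaries. As a rough illustration only (this is not the authors' evaluation code; the function names, the whitespace tokenization, and the recall-oriented form of Rouge-n are assumptions made for this sketch), they could be computed along these lines:

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Recall-oriented Rouge-n: clipped n-gram overlap / reference n-gram count.

    Assumption: both summaries are plain strings split on whitespace.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipped counting).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def sentence_f_score(selected, gold):
    """Precision, recall and F-score over sets of selected sentence ids.

    `selected` holds the sentences a summarizer picked; `gold` holds the
    sentences chosen by human annotators (both hypothetical inputs).
    """
    selected, gold = set(selected), set(gold)
    hits = len(selected & gold)
    if hits == 0:
        return 0.0, 0.0, 0.0
    precision = hits / len(selected)
    recall = hits / len(gold)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score
```

For example, `rouge_n("the app crashes on startup", "app crashes on startup after update", n=1)` gives 4/6, because four of the six reference unigrams also appear in the candidate summary.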

Published In

ICPC '18: Proceedings of the 26th Conference on Program Comprehension
May 2018
423 pages
ISBN: 9781450357142
DOI: 10.1145/3196321
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bug report summarization
  2. deep learning
  3. mining software repositories
  4. unsupervised learning

Qualifiers

  • Research-article

Conference

ICSE '18

Cited By

  • (2024) Deep learning-based software engineering: progress, challenges, and opportunities. Science China Information Sciences 68:1. DOI: 10.1007/s11432-023-4127-5. Online publication date: 24-Dec-2024
  • (2024) KeyTitle: towards better bug report title generation by keywords planning. Software Quality Journal 32:4 (1655-1682). DOI: 10.1007/s11219-024-09695-z. Online publication date: 13-Sep-2024
  • (2023) Predicting the Change Impact of Resolving Defects by Leveraging the Topics of Issue Reports in Open Source Software Systems. ACM Transactions on Software Engineering and Methodology 32:6 (1-34). DOI: 10.1145/3593802. Online publication date: 30-Sep-2023
  • (2023) Incident-Aware Duplicate Ticket Aggregation for Cloud Systems. Proceedings of the 45th International Conference on Software Engineering (2299-2311). DOI: 10.1109/ICSE48619.2023.00193. Online publication date: 14-May-2023
  • (2023) Automated Summarization of Stack Overflow Posts. Proceedings of the 45th International Conference on Software Engineering (1853-1865). DOI: 10.1109/ICSE48619.2023.00158. Online publication date: 14-May-2023
  • (2023) RepresentThemAll: A Universal Learning Representation of Bug Reports. Proceedings of the 45th International Conference on Software Engineering (602-614). DOI: 10.1109/ICSE48619.2023.00060. Online publication date: 14-May-2023
  • (2023) Deep Learning in Requirement Engineering: A Statistical Justification. 2023 International Conference on New Frontiers in Communication, Automation, Management and Security (ICCAMS) (1-8). DOI: 10.1109/ICCAMS60113.2023.10525759. Online publication date: 27-Oct-2023
  • (2023) A first look at bug report templates on GitHub. Journal of Systems and Software 202:C. DOI: 10.1016/j.jss.2023.111709. Online publication date: 1-Aug-2023
  • (2022) Deep Learning-Based Bug Report Summarization Using Sentence Significance Factors. Applied Sciences 12:12 (5854). DOI: 10.3390/app12125854. Online publication date: 8-Jun-2022
  • (2022) Bug report summarization using multi-view multi-objective optimization framework. Proceedings of the Genetic and Evolutionary Computation Conference (1245-1253). DOI: 10.1145/3512290.3528843. Online publication date: 8-Jul-2022
