research-article

An empirical study on the potential of word embedding techniques in bug report management tasks

Published: 25 July 2024

Abstract

Context

Representing the textual semantics of bug reports is a key component of bug report management (BRM) techniques. Existing studies mainly use classical information retrieval-based (IR-based) approaches, such as the vector space model (VSM), for semantic extraction. Little attention has been paid to exploring whether word embedding (WE) models from natural language processing could help BRM tasks.
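To make the classical VSM idea concrete, the following is a minimal sketch of a TF-IDF vector space model with cosine similarity, applied to three made-up bug report summaries. It is an illustration only, not the configuration used in the study: the tokenizer, the smoothed IDF formula, the function names (`tokenize`, `tfidf_vectors`, `cosine`) and the example reports are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict: term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    # Smoothed IDF (an assumed variant; other weighting schemes exist).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(u, v):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical bug report summaries, invented for this example.
reports = [
    "Editor crashes when opening large XML file",
    "Crash in editor while opening a large XML file",
    "Toolbar icons are blurry on high DPI display",
]
vecs = tfidf_vectors(reports)
sim_dup = cosine(vecs[0], vecs[1])    # likely duplicates share many terms
sim_other = cosine(vecs[0], vecs[2])  # unrelated reports share none
print(f"similar pair: {sim_dup:.3f}, unrelated pair: {sim_other:.3f}")
```

Because VSM relies on exact term overlap, the two crash reports score well above the unrelated pair, while "crashes" and "crash" still count as different terms. That lexical gap is the kind of limitation that word embeddings are often expected to reduce.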

Objective

To gain a general view of the potential of word embedding models for representing the semantics of bug reports, and to provide actionable guidelines for using semantic retrieval models in BRM tasks.

Method

We studied the efficacy of five widely recognized WE models for six BRM tasks on 20 widely used products from the Eclipse and Mozilla foundations. Specifically, we first explored which machine learning techniques are suitable when WE models are used and which WE model is suitable for BRM tasks. We then studied whether WE models performed better than the classical VSM. Finally, we investigated whether WE models fine-tuned with bug reports outperformed general pre-trained WE models.
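One common way to turn word embeddings into a fixed-length representation of a whole bug report is to average the vectors of its words; the resulting vector can then be fed to a classifier or compared with cosine similarity. The sketch below illustrates that idea only. Mean pooling is an assumption made for this example and is not confirmed to be the authors' method, and the three-dimensional `TOY_EMBEDDINGS` table is invented, standing in for a real pre-trained model.

```python
import math

# Hypothetical toy embeddings (3 dimensions); real WE models have far more
# dimensions and a far larger vocabulary.
TOY_EMBEDDINGS = {
    "crash": [0.9, 0.1, 0.0],
    "freeze": [0.8, 0.2, 0.1],
    "editor": [0.1, 0.9, 0.0],
    "typo": [0.0, 0.2, 0.9],
    "label": [0.1, 0.3, 0.8],
}


def embed_report(text, embeddings, dim=3):
    # Mean-pool the vectors of all in-vocabulary words in the report.
    vectors = [embeddings[t] for t in text.lower().split() if t in embeddings]
    if not vectors:
        # No known word: fall back to the zero vector.
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


crash_report = embed_report("editor crash", TOY_EMBEDDINGS)
freeze_report = embed_report("editor freeze", TOY_EMBEDDINGS)
typo_report = embed_report("label typo", TOY_EMBEDDINGS)

print(cosine(crash_report, freeze_report))  # related wording, no shared symptom term
print(cosine(crash_report, typo_report))    # unrelated topic
```

Unlike exact term matching, the pooled vectors place "editor crash" close to "editor freeze" even though "crash" and "freeze" are different words. Whether that closeness helps a downstream BRM task is the empirical question the study examines.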

Key Results

The Random Forest (RF) classifier outperformed other typical classifiers when different WE models were used for semantic extraction. We rarely observed statistically significant performance differences among the five WE models in five BRM classification tasks, but we found that small-dimensional WE models performed better than larger ones in the duplicate bug report detection task. Among the three BRM tasks (i.e., bug severity prediction, reopened bug prediction, and duplicate bug report detection) that showed statistically significant performance differences, VSM outperformed the studied WE models. We did not find any performance improvement after fine-tuning the general pre-trained BERT with bug report data.

Conclusion

No performance improvement from using pre-trained WE models was observed in the studied BRM tasks. The combination of RF and the traditional VSM was found to achieve the best performance in various BRM tasks.



    Published In

    Empirical Software Engineering  Volume 29, Issue 5
    Sep 2024
    1352 pages

    Publisher

    Kluwer Academic Publishers

    United States

    Publication History

    Published: 25 July 2024
    Accepted: 30 May 2024

    Author Tags

    1. Bug report
    2. Word embedding
    3. Pre-trained models
    4. Vector space model

    Qualifiers

    • Research-article
