Abstract
Fault localization aims to efficiently locate faults when debugging programs, reducing software development and maintenance costs. Spectrum-based fault location (SBFL) is the most commonly used fault location technology, which calculates and ranks the suspicious value of each program entity with a specific formula by counting the coverage information of all the program entities and execution results of test cases. However, previous SBFL techniques suffered from low accuracy due to the sole use of execution coverage. This paper proposed an approach GNet4FL based on the graph convolutional neural network. GNet4FL first collects static features based on code structure and dynamic features based on test results. Then, GNet4FL uses GraphSAGE to obtain node representation of source codes and performs feature fusion on an entity consisting of multiple nodes, which preserves the topological information of the graph. Finally, the representation of each entity is input to the multi-layer perceptron for training and ranking entities. The results of the study showed that GNet4FL successfully located 160 out of 262 faults, outperforming the three state-of-the-art methods by 94, 42, and 14% in Top-1 accuracy, and having close results to Grace with less cost. Furthermore, we investigated the impact of each component (i.e., graph neural network, pruning, and dynamic features) of GNet4FL on the results. We found that all of these components had a positive impact on the proposed approach.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Abreu, R., Zoeteweij, P., Van Gemund, A.J.: An evaluation of similarity coefficients for software fault localization. In: 2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC’06), pp. 39–46. IEEE (2006)
Abreu, R., Zoeteweij, P., Golsteijn, R., Van Gemund, A.J.: A practical evaluation of spectrum-based fault localization. J. Syst. Softw. 82(11), 1780–1792 (2009)
Alon, U., Zilberstein, M., Levy, O., Yahav, E.: code2vec: learning distributed representations of code. In: Proceedings of the ACM on Programming Languages 3(POPL), pp. 1–29 (2019)
Anowar, F., Sadaoui, S., Selim, B.: Conceptual and empirical comparison of dimensionality reduction algorithms (pca, kpca, lda, mds, svd, lle, isomap, le, ica, t-sne). Comput. Sci. Rev. 40, 100378 (2021)
Bruna, J., Zaremba, W., Szlam, A., LeCun, Y.: Spectral networks and locally connected networks on graphs (2013). arXiv preprint arXiv:1312.6203
Chung, H.M., Gey, F., Piramuthu, S.: Data mining and information retrieval. In: Proceedings of the 35th Annual Hawaii International Conference on System Sciences, vol. 7, pp. 841–842. IEEE Computer Society (2002)
Dutta, A., Godboley, S.: Msfl: a model for fault localization using mutation-spectra technique. In: International Conference on Lean and Agile Software Development, pp. 156–173 (2021). Springer
Feyzi, F.: CGT-FL: using cooperative game theory to effective fault localization in presence of coincidental correctness. Empir. Softw. Eng. 25(5), 3873–3927 (2020)
Fushiki, T.: Estimation of prediction error by using k-fold cross-validation. Stat. Comput. 21(2), 137–146 (2011)
Gong, P., Zhao, R., Li, Z.: Faster mutation-based fault localization with a novel mutation execution strategy. In: 2015 IEEE Eighth International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 1–10. IEEE (2015)
Gu, W., Tandon, A., Ahn, Y.-Y., Radicchi, F.: Principled approach to the selection of the embedding dimension of networks. Nat. Commun. 12(1), 1–10 (2021)
Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1025–1035 (2017)
Jones, J.A., Harrold, M.J.: Empirical evaluation of the tarantula automatic fault-localization technique. In: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, pp. 273–282 (2005)
Ju, X., Jiang, S., Chen, X., Wang, X., Zhang, Y., Cao, H.: HSFal: effective fault localization using hybrid spectrum of full slices and execution slices. J. Syst. Softw. 90, 3–17 (2014)
Keller, F., Grunske, L., Heiden, S., Filieri, A., van Hoorn, A., Lo, D.: A critical evaluation of spectrum-based fault localization techniques on a large-scale software system. In: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 114–125. IEEE (2017)
Kipf, T.N., Welling, M.: Semi-supervised Learning with Graph Convolutional Networks. ICLR (2017)
Küçük, Y., Henderson, T.A., Podgurski, A.: Improving fault localization by integrating value and predicate based causal inference techniques. In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pp. 649–660. IEEE (2021)
Lam, A.N., Nguyen, A.T., Nguyen, H.A., Nguyen, T.N.: Bug localization with combination of deep learning and information retrieval. In: 2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC), pp. 218–229 (2017). IEEE
Li, X., Zhang, L.: Transforming programs and tests in tandem for fault localization. In: Proceedings of the ACM on Programming Languages 1(OOPSLA), pp. 1–30 (2017)
Li, X., Li, W., Zhang, Y., Zhang, L.: DeepFL: integrating multiple fault diagnosis dimensions for deep fault localization. In: Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 169–180 (2019)
Li, Y., Wang, S., Nguyen, T.N.: Fault localization with code coverage representation learning. In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pp. 661–673. IEEE (2021)
Liu, C., Yan, X., Fei, L., Han, J., Midkiff, S.P.: SOBER: statistical model-based bug localization. SIGSOFT Softw. Eng. Notes 30(5), 286–295 (2005). https://doi.org/10.1145/1095430.1081753
Liu, Y., Li, Z., Wang, L., Hu, Z., Zhao, R.: Statement-oriented mutant reduction strategy for mutation based fault localization. In: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 126–137. IEEE (2017)
Lou, Y., Zhu, Q., Dong, J., Li, X., Sun, Z., Hao, D., Zhang, L., Zhang, L.: Boosting coverage-based fault localization via graph-based representation learning. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 664–676 (2021)
Mao, X., Lei, Y., Dai, Z., Qi, Y., Wang, C.: Slice-based statistical fault localization. J. Syst. Softw. 89, 51–62 (2014)
Naish, L., Lee, H.J., Ramamohanarao, K.: A model for spectra-based software diagnosis. ACM Trans. Softw. Eng. Methodol. (TOSEM) 20(3), 1–32 (2011)
Papadakis, M., Le Traon, Y.: Metallaxis-FL: mutation-based fault localization. Softw. Test. Verif. Reliab. 25(5–7), 605–628 (2015)
Park, S., Vuduc, R.W., Harrold, M.J.: Falcon: fault localization in concurrent programs. In: Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering, vol. 1, pp. 245–254 (2010)
Pearson, S., Campos, J., Just, R., Fraser, G., Abreu, R., Ernst, M.D., Pang, D., Keller, B.: Evaluating and improving fault localization. In: 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE), pp. 609–620 (2017). IEEE
Peng, Z., Xiao, X., Hu, G., Sangaiah, A.K., Atiquzzaman, M., Xia, S.: Abfl: an autoencoder based practical approach for software fault localization. Inf. Sci. 510, 108–121 (2020)
Planning, S.: The economic impacts of inadequate infrastructure for software testing. National Institute of Standards and Technology (2002)
Qian, J., Ju, X., Chen, X., Shen, H., Shen, Y.: AGFL: a graph convolutional neural network-based method for fault localization. In: 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS), pp. 672–680 (2021). IEEE
Raunak, V.: Simple and effective dimensionality reduction for word embeddings (2017). arXiv preprint arXiv:1708.03629
Sohn, J., Yoo, S.: Fluccs: Using code and change metrics to improve fault localization. In: Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 273–283 (2017)
Soremekun, E., Kirschner, L., Böhme, M., Zeller, A.: Locating faults with program slicing: an empirical analysis. Empir. Softw. Eng. 26(3), 1–45 (2021)
Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11), 2579–2605 (2008)
Vancsics, B., Horváth, F., Szatmári, A., Beszédes, Á.: Fault localization using function call frequencies. J. Syst. Softw. 193, 111429 (2022)
Wang, X., Ji, H., Shi, C., Wang, B., Ye, Y., Cui, P., Yu, P.S.: Heterogeneous graph attention network. In: The World Wide Web Conference, pp. 2022–2032 (2019)
Wong, W.E., Qi, Y.: BP neural network-based effective fault localization. Int. J. Softw. Eng. Knowl. Eng. 19(04), 573–597 (2009)
Wong, W.E., Debroy, V., Choi, B.: A family of code coverage-based heuristics for effective fault localization. J. Syst. Softw. 83(2), 188–208 (2010). https://doi.org/10.1016/j.jss.2009.09.037
Wong, W.E., Debroy, V., Gao, R., Li, Y.: The DStar method for effective software fault localization. IEEE Trans. Reliab. 63(1), 290–308 (2013)
Wong, W.E., Gao, R., Li, Y., Abreu, R., Wotawa, F.: A survey on software fault localization. IEEE Trans. Softw. Eng. 42(8), 707–740 (2016)
Woolson, R.F.: Wilcoxon signed-rank test. In: Wiley Encyclopedia of Clinical Trials, pp. 1–3 (2007)
Wu, F., Souza, A., Zhang, T., Fifty, C., Yu, T., Weinberger, K.: Simplifying graph convolutional networks. In: International Conference on Machine Learning, pp. 6861–6871. PMLR (2019)
Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., Philip, S.Y.: A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32(1), 4–24 (2020)
Xiao, X., Pan, Y., Zhang, B., Hu, G., Li, Q., Lu, R.: ALBFL: a novel neural ranking model for software fault localization via combining static and dynamic features. Inf. Softw. Technol. 139, 106653 (2021)
Xie, X., Chen, T.Y., Kuo, F.-C., Xu, B.: A theoretical analysis of the risk evaluation formulas for spectrum-based fault localization. ACM Trans. Softw. Eng. Methodol. (TOSEM) 22(4), 1–40 (2013a)
Xie, X., Wong, W.E., Chen, T.Y., Xu, B.: Metamorphic slice: an application in spectrum-based fault localization. Inf. Softw. Technol. 55(5), 866–879 (2013b)
Xu, J., Wang, F., Ai, J.: Defect prediction with semantics and context features of codes based on graph representation learning. IEEE Trans. Reliab. 70(2), 613–625 (2020)
Xuan, J., Jiang, H., Hu, Y., Ren, Z., Zou, W., Luo, Z., Wu, X.: Towards effective bug triage with software data reduction techniques. IEEE Trans. Knowl. Data Eng. 27(1), 264–280 (2014)
Yang, Y., Zhou, Y., Liu, J., Zhao, Y., Lu, H., Xu, L., Xu, B., Leung, H.: Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models. In: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 157–168 (2016)
Yang, Y., Xia, X., Lo, D., Grundy, J.: A survey on deep learning for software engineering. ACM Comput. Surv. (CSUR) (2021). https://doi.org/10.1145/3505243
Zhang, J., Wang, X., Zhang, H., Sun, H., Wang, K., Liu, X.: A novel neural source code representation based on abstract syntax tree. In: 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pp. 783–794. IEEE (2019a)
Zhang, M., Li, Y., Li, X., Chen, L., Zhang, Y., Zhang, L., Khurshid, S.: An empirical study of boosting spectrum-based fault localization via pagerank. IEEE Trans. Softw. Eng. 47(6), 1089–1113 (2019b)
Zhang, J.M., Harman, M., Ma, L., Liu, Y.: Machine learning testing: survey, landscapes and horizons. IEEE Trans. Softw. Eng. 48, 1–36 (2020)
Zhang, Z., Lei, Y., Mao, X., Yan, M., Xu, L., Zhang, X.: A study of effectiveness of deep learning in locating real faults. Inf. Softw. Technol. 131, 106486 (2021)
Zhou, J., Zhang, H., Lo, D.: Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports. In: 2012 34th International Conference on Software Engineering (ICSE), pp. 14–24 (2012). IEEE
Zou, D., Liang, J., Xiong, Y., Ernst, M.D., Zhang, L.: An empirical study of fault localization families and their combinations. IEEE Trans. Softw. Eng. 47(2), 332–347 (2019)
Author information
Authors and Affiliations
Contributions
JQ and XJ wrote the main manuscript text and figures, and XC helped to improve the approach and the framework. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Qian, J., Ju, X. & Chen, X. GNet4FL: effective fault localization via graph convolutional neural network. Autom Softw Eng 30, 16 (2023). https://doi.org/10.1007/s10515-023-00383-z
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10515-023-00383-z