Abstract
In this article, we emphasize that most faults in real-world programs are complex and that faulty statements interact strongly with other correlated statements, which is likely to cause coincidental correctness in many cases. To diminish the negative impact of coincidentally correct tests on localization effectiveness, we suggest analyzing the combinatorial effect of program statements on the failure. To this end, we develop a new framework, CGT-FL, for evaluating and ranking program statements so that statements that have strong discriminatory power as a group but are weak individually can be identified. The framework first evaluates the interactivity degree of each statement, according to its influence on the intricate interrelations among statements, using a Shapley value-based cooperative game-theoretic method. Statements are then selected in a forward manner by considering both interactivity and relevance measures. To verify the effectiveness of CGT-FL, we report the results of extensive experiments on different subject programs containing seeded and real faults. The experimental results are compared with those of different fault localization techniques for both single-fault and multiple-fault programs. The results show that CGT-FL outperforms state-of-the-art techniques.
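To make the Shapley value idea concrete, the following is a minimal sketch of how marginal contributions can be averaged over all orderings of a small set of statements. It is not the CGT-FL implementation: the function names (`shapley_values`, `toy_value`), the statement labels, and the toy characteristic function are all hypothetical, since the abstract does not specify the value function the framework uses. The toy game is chosen so that two statements explain the failure only as a pair, which mirrors the case of statements that are weak individually but strong as a group.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


def toy_value(coalition):
    """Hypothetical characteristic function (an assumption, not from the
    paper): the coalition discriminates failing from passing runs only
    when it contains both s1 and s2."""
    return 1 if {"s1", "s2"} <= coalition else 0


phi = shapley_values(["s1", "s2", "s3"], toy_value)
for statement, score in phi.items():
    print(statement, score)
```

In this toy game, neither `s1` nor `s2` has any value alone, yet each receives half of the total, while `s3` receives nothing. Exact enumeration grows factorially with the number of statements, so a sketch like this is only practical for very small sets.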
Additional information
Guest Editor: Shin Yoo
About this article
Cite this article
Feyzi, F. CGT-FL: using cooperative game theory to effective fault localization in presence of coincidental correctness. Empir Software Eng 25, 3873–3927 (2020). https://doi.org/10.1007/s10664-020-09859-y
DOI: https://doi.org/10.1007/s10664-020-09859-y