Abstract
Testing concurrent software is challenging because of factors absent from sequential programs, such as communication, synchronization, and non-determinism, all of which directly affect the testing process. Multithreaded programs impose further challenges on the testing activity. In the context of structural testing, an important open problem is how to cover shared variables so that definition-use (def-use) associations across threads are established. This paper presents contributions to the structural testing of multithreaded programs: coverage-based testing criteria, a supporting tool called ValiPthread, and the results of an experimental study. The study was conducted to evaluate the cost, effectiveness, and strength of the testing criteria, as well as their contribution to testing specific aspects of multithreaded programs. The experimental results provide evidence that the proposed criteria have lower cost and higher effectiveness in revealing certain kinds of defects, such as deadlocks and blocked critical regions. Moreover, the comparison with sequential testing criteria shows that coverage criteria specific to multithreaded programs are worth establishing.
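To make the notion of an inter-thread def-use association concrete, consider the following minimal C/Pthreads sketch. It is illustrative only (it is not taken from the paper or from the ValiPthread tool): one thread defines a shared variable and another uses it, and which value the use observes depends on the thread interleaving. Pairs of this kind are what shared-variable coverage criteria must require test sets to exercise.

/* Illustrative sketch: a def-use association on a shared
 * variable that crosses thread boundaries. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int shared = 0;              /* shared variable */

static void *writer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    shared = 42;                    /* def of `shared` in thread 1 */
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *reader(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    printf("shared = %d\n", shared); /* use of `shared` in thread 2 */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, writer, NULL);
    pthread_create(&t2, NULL, reader, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

Depending on which thread acquires the mutex first, the use prints 0 or 42; a sequential def-use criterion sees no association here at all, which is why criteria specific to multithreaded programs are needed.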
Acknowledgements
The authors acknowledge the São Paulo Research Foundation (FAPESP) for the financial support provided to this research (processes no. 2010/04042-1, 2013/05046-9, 2013/01818-7, and 2015/23653-5).
Cite this article
Melo, S.M., de Souza, S.d.R.S., Sarmanho, F.S. et al. Contributions for the structural testing of multithreaded programs: coverage criteria, testing tool, and experimental evaluation. Software Qual J 26, 921–959 (2018). https://doi.org/10.1007/s11219-017-9376-4