In this study, we investigated the functional form of the size-defect relationship for software modules through replicated studies conducted on ten open-source products. We consistently observed a power-law relationship where defect proneness increases at a slower rate than size. Therefore, smaller modules are proportionally more defect-prone. We externally validated the application of our results for two commercial systems. Given limited and fixed resources for code inspections, there would be an impressive improvement in the cost-effectiveness, as much as 341% in one of the systems, if a smallest-first strategy were preferred over a largest-first one. The consistent results obtained in this study led us to state a theory of relative defect proneness (RDP): In large-scale software systems, smaller modules will be proportionally more defect-prone compared to larger ones. We suggest that practitioners consider our results and give higher priority to smaller modules in their focused quality assurance efforts.
Webcite link: http://www.webcitation.org/5RqqbCKKm (cached Sep. 14, 2007)
Webcite link: http://www.webcitation.org/5Rqr0BSz8 (cached Sep. 14, 2007)
CVS was the source code control system used by the KOffice developers. Webcite link: http://www.webcitation.org/5RrT2BaV1 (cached Sep. 14, 2007)
Perl is a stable, cross-platform programming language. Webcite link: http://www.webcitation.org/5RrTDEdYV (cached Sep. 14, 2007)
We would like to thank Frank E. Harrell for extending and modifying some of the functionality in his Design package for us, Victor R. Basili for his helpful comments, Jeff Tian for providing data, the associate editor, Tim Menzies, for his guidance and suggestions, and the anonymous reviewers of this paper for their helpful and constructive feedback.
In this appendix, we explain how to calculate the RDP of the modules chosen by one inspection strategy with respect to those chosen by another inspection strategy. The first inspection strategy chooses $m$ modules having sizes (in LOC) $s_1, s_2, \ldots, s_m$, and the second one chooses $n$ modules having sizes $S_1, S_2, \ldots, S_n$.
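First, let us take a reference module with size $C$. Since we observed a logarithmic shape for the link function, following (2) and omitting the time parameter $t$ to simplify the notation, the RDP of an individual module of size $s$ with respect to this reference module at any time $t$ would be $e^{\beta(\ln s - \ln C)}$. For each inspection strategy, we calculate the sum of the RDP of the selected individual modules with respect to the reference module. To find the RDP of the modules chosen by the first strategy with respect to those chosen by the second, we simply take the ratio of these sums:

$$\mathrm{RDP} = \frac{\sum_{i=1}^{m} e^{\beta(\ln s_i - \ln C)}}{\sum_{j=1}^{n} e^{\beta(\ln S_j - \ln C)}} = \frac{\sum_{i=1}^{m} s_i^{\beta}}{\sum_{j=1}^{n} S_j^{\beta}}$$

The reference size $C$ cancels in the ratio, so the result does not depend on which reference module is chosen.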
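The calculation described in this appendix can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the module sizes, and the value of `beta` are all hypothetical and chosen only to show the mechanics. It assumes the first strategy's sum goes in the numerator, matching "the RDP of the modules chosen by one inspection strategy with respect to those chosen by another."

```python
import math


def summed_rdp(sizes, beta, reference_size):
    """Sum of exp(beta * (ln s - ln C)) over the selected module sizes.

    All sizes and the reference size must be positive (LOC counts).
    """
    return sum(
        math.exp(beta * (math.log(s) - math.log(reference_size)))
        for s in sizes
    )


def relative_defect_proneness(first_sizes, second_sizes, beta, reference_size=100.0):
    """RDP of the first strategy's modules with respect to the second's.

    The reference size cancels in the ratio, so its default value
    is arbitrary (an assumption made here for illustration only).
    """
    return (
        summed_rdp(first_sizes, beta, reference_size)
        / summed_rdp(second_sizes, beta, reference_size)
    )


# Hypothetical example: both strategies inspect 100 LOC in total.
smallest_first = [10, 20, 30, 40]
largest_first = [100]
beta = 0.5  # hypothetical coefficient; beta < 1 means sub-linear growth

rdp = relative_defect_proneness(smallest_first, largest_first, beta)
print(f"RDP of smallest-first relative to largest-first: {rdp:.3f}")
```

With any `beta` below 1, the smallest-first selection in this toy setup yields a ratio above 1 for the same inspected LOC, which is the effect the theory describes. With `beta` equal to 1, the ratio reduces to the ratio of total inspected sizes.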
Koru, A.G., Emam, K.E., Zhang, D. et al. Theory of relative defect proneness. Empir Software Eng 13, 473–498 (2008). https://doi.org/10.1007/s10664-008-9080-x