Abstract
The goals of cross-product reuse in a software product line (SPL) are to reduce production costs and improve quality. In addition to reuse across products, an SPL also exhibits reuse across releases because of its evolutionary development process. In this paper, we empirically explore how the two types of reuse (reuse across products and reuse across releases) affect the quality of an SPL and our ability to accurately predict fault proneness. We measure quality in terms of post-release faults and consider different levels of reuse across products (i.e., common, high-reuse variation, low-reuse variation, and single-use packages) over multiple releases. The assessment results showed that quality improved for common, low-reuse variation, and single-use packages as they evolved across releases. Surprisingly, within each release, among preexisting (‘old’) packages, cross-product reuse did not affect change proneness or fault proneness. Cross-product predictions based on pre-release data accurately ranked the packages according to their post-release faults and predicted the 20% most faulty packages. The predictions benefited from data available for other products in the product line, with models producing better results (1) when making predictions on smaller products (consisting mostly of common packages) rather than on larger products and (2) when trained on larger products rather than on smaller products.
Notes
A fault is defined as an accidental condition which, if encountered, may cause the system or system component to fail to perform as required. We avoid using the term defect, which is used inconsistently in the literature to refer in some cases to both faults and failures and in other cases only to faults or, perhaps, to faults detected pre-release.
For a comprehensive survey of binary classification studies the reader is referred to the recent paper by Hall et al. (2012).
For this study, a package is considered faulty if any file contained in that package exhibited one or more post-release faults.
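The labeling rule in this note can be illustrated with a minimal sketch. The package names, file names, and fault counts below are hypothetical and are not taken from the study's data; the function name `faulty_packages` is likewise an assumption made for illustration.

```python
from collections import defaultdict


def faulty_packages(file_faults):
    """Label a package faulty if any of its files has >= 1 post-release fault.

    file_faults maps (package, file) pairs to post-release fault counts.
    """
    faulty = defaultdict(bool)
    for (package, _file), n_faults in file_faults.items():
        faulty[package] = faulty[package] or n_faults > 0
    return dict(faulty)


# Hypothetical example data (not from the paper).
file_faults = {
    ("org.example.core", "A.java"): 0,
    ("org.example.core", "B.java"): 2,
    ("org.example.ui", "C.java"): 0,
}
labels = faulty_packages(file_faults)
```

Here `org.example.core` is labeled faulty because one of its two files has post-release faults, while `org.example.ui` is not.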
Thompson and Heimdahl (2003) proposed a set-theoretic approach to represent requirements reuse in product line engineering, which described the boundaries of sets as commonalities and the members within the sets as products. The approach taken in our previous work (Devine et al. 2012) and used here is complementary to Thompson and Heimdahl (2003). Specifically, it is used to illustrate the amount of shared code at different levels of cross-product reuse; the elements within the sets are packages of the SPL, and the boundaries of sets define the products.
pserver:anonymous@dev.eclipse.org:2401/cvsroot
The complexity measure used by SourceMonitor approximately follows the definition by McConnell (2004).
Some form of \(k\)-fold cross validation is commonly employed in machine learning in general and in software engineering in particular. Cross validation splits the data randomly into \(k\) groups and predicts values for each group using a model built on the other \(k-1\) groups. Each of the \(k\) groups serves once as the testing group, and the results are averaged over the \(k\) trials. Cross validation may yield better results than building models and predicting on disjoint data sets (as was done in this paper) because averaging over \(k\) repeated trials produces more consistent, smoothed results than a single trial on disjoint sets.
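For readers unfamiliar with the procedure, the following is a minimal sketch of \(k\)-fold cross validation in plain Python. It is not the evaluation used in this paper, which trained and tested on disjoint data sets. The simple least-squares line, the mean absolute error measure, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly split the indices 0..n-1 into k disjoint groups."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


def cross_validate(xs, ys, k, fit, predict, seed=0):
    """Return the mean absolute error averaged over the k held-out groups."""
    fold_errors = []
    for test in k_fold_indices(len(xs), k, seed):
        held_out = set(test)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Build the model on the other k-1 groups only.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Evaluate on the group that was held out.
        err = sum(abs(predict(model, xs[i]) - ys[i]) for i in test) / len(test)
        fold_errors.append(err)
    return sum(fold_errors) / k


# Hypothetical noise-free data, so the held-out error is essentially zero.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
mean_abs_error = cross_validate(xs, ys, k=5, fit=fit_line, predict=predict_line)
```

Every observation is used for testing exactly once and for training \(k-1\) times, which is what distinguishes this scheme from a single split into disjoint training and testing sets.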
Many software metrics are highly correlated with each other, which gives rise to a problem commonly referred to as multicollinearity. To quote Kutner et al. (2004): “The fact that some or all predictor variables are correlated among themselves does not, in general, inhibit our ability to obtain a good fit nor does it tend to affect inferences about mean responses or predictions of new observations ...” However, multicollinearity may cause the estimated regression coefficients to have a large sampling variability and thus affect explanatory studies.
Kendall’s \(\tau _b\) approaches the normal distribution quite rapidly so that the normal approximation is better for Kendall’s \(\tau _b\) than it is for Spearman’s \(\rho \). Another advantage of Kendall’s \(\tau _b\) is its direct and simple interpretation in terms of probabilities of observing concordant pairs (both numbers of one observation are larger than their respective members of the other observation) and discordant pairs (the two numbers in one observation differ in opposite directions from the respective members in the other observation).
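The interpretation in terms of concordant and discordant pairs can be made concrete with a short sketch. It uses the standard definition \(\tau _b = (C - D)/\sqrt{(n_0 - n_1)(n_0 - n_2)}\), where \(C\) and \(D\) are the numbers of concordant and discordant pairs, \(n_0\) is the total number of pairs, and \(n_1\) and \(n_2\) are the numbers of pairs tied in the first and second variable. The function name and the example rankings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Return (tau_b, concordant, discordant) for paired observations x, y."""
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx * dy > 0:
            concordant += 1  # both values differ in the same direction
        elif dx * dy < 0:
            discordant += 1  # the values differ in opposite directions
    n0 = len(x) * (len(x) - 1) // 2
    denom = sqrt((n0 - tied_x) * (n0 - tied_y))
    if denom == 0:
        raise ValueError("tau_b is undefined when one variable is constant")
    return (concordant - discordant) / denom, concordant, discordant


# Hypothetical predicted and actual ranks for four packages.
predicted = [1, 2, 3, 4]
actual = [1, 3, 2, 4]
tau, c, d = kendall_tau_b(predicted, actual)
```

With no ties, \(\tau _b\) reduces to the proportion of concordant pairs minus the proportion of discordant pairs. In this example five of the six pairs are concordant and one is discordant, so \(\tau _b = (5 - 1)/6\).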
If the Friedman test rejects the null hypothesis that there is no difference, a post hoc multiple comparison test is used to identify where the difference lies. Alternatively, instead of the Friedman test, one can use the Page test, which tests the null hypothesis that there is no difference among several related samples (i.e., \(H_0: \mu _1 = \mu _2 = \mu _3\)) against the ordered alternative that the samples differ in a specified direction, with at least one strict inequality (i.e., \(H_1: \mu _1 \ge \mu _2 \ge \mu _3\)).
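Both tests are based on ranking the observations within each block (here, each row of related samples). The sketch below computes the Friedman chi-square statistic and Page's \(L\) statistic from the column rank sums. It assumes no ties within a block, applies no tie correction, and does not compute p-values; significance would be judged against the chi-square distribution with \(k-1\) degrees of freedom for the Friedman statistic and against tables or a normal approximation for Page's \(L\). The sketch follows the convention that the columns are listed in the hypothesized increasing order, so for a decreasing alternative such as the one above the columns are reversed first. The data and function names are hypothetical.

```python
def rank_row(row):
    """Ranks 1..k within one block; assumes the block contains no ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0] * len(row)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


def friedman_and_page(blocks):
    """Return (Friedman chi-square, Page's L, column rank sums).

    blocks is a list of n rows, each holding k related measurements whose
    columns are listed in the hypothesized increasing order.
    """
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    page_l = sum((j + 1) * r for j, r in enumerate(rank_sums))
    return chi2, page_l, rank_sums


# Hypothetical measurements for four blocks over three related samples,
# hypothesized to decrease from the first column to the third.
blocks = [
    [10, 8, 5],
    [9, 7, 6],
    [12, 11, 4],
    [7, 8, 3],
]
# Reverse the columns so they follow the hypothesized increasing order.
chi2, page_l, rank_sums = friedman_and_page([row[::-1] for row in blocks])
```

The Friedman statistic does not depend on the column order, whereas Page's \(L\) grows as the rank sums follow the hypothesized ordering more closely.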
References
Agresti, A.: Analysis of Ordinal Categorical Data. John Wiley and Sons Inc, Hoboken, NJ (2010)
Andersson, C., Runeson, P.: A replicated quantitative analysis of fault distributions in complex software systems. IEEE Trans. Softw. Eng. 33, 273–286 (2007)
Bell, R.M., Ostrand, T.J., Weyuker, E.J.: Looking for bugs in all the right places. In: Proceedings of the 2006 International Symposium on Software Testing and Analysis, ISSTA ’06, pp. 61–72 (2006)
Bibi, S., Tsoumakas, G., Stamelos, I., Vlahavas, I.: Software defect prediction using regression via classification. In: Proceedings of the IEEE International Conference on Computer Systems and Applications, AICCSA ’06, pp. 330–336 (2006)
Bingham, N.H., Fry, J.M.: Regression: Linear Models in Statistics, 1st edn. Springer-Verlag, London (2010)
Boehm, B., Basili, V.R.: Software defect reduction top 10 list. Computer 34, 135–137 (2001)
Breivold, H.P., Crnkovic, I., Larsson, M.: A systematic review of software architecture evolution research. Info. Softw. Technol. 54(1), 16–40 (2012)
Chastek, G., McGregor, J., Northrop, L.: Observations from viewing Eclipse as a product line. In: Proceedings of the 3rd International Workshop on Open Source Software and Product Lines, pp. 1–6 (2007)
D’Ambros, M., Lanza, M., Robbes, R.: On the relationship between change coupling and software defects. In: Proceedings of the 16th Working Conference on Reverse Engineering, WCRE ’09, pp. 135–144 (2009)
D’Ambros, M., Lanza, M., Robbes, R.: An extensive comparison of bug prediction approaches. In: Proceedings of the 7th IEEE Working Conference on Mining Software Repositories, MSR ’10, pp. 31–41 (2010)
D’Ambros, M., Lanza, M., Robbes, R.: Evaluating defect prediction approaches: a benchmark and an extensive comparison. Empir. Softw. Eng. 17, 531–577 (2012)
Devine, T., Goseva-Popstojanova, K., Krishnan, S., Lutz, R., Li, J.: An empirical study of pre-release software faults in an industrial product line. In: Proceedings of the 5th IEEE International Conference on Software Testing, Verification and Validation, ICST ’12, pp. 181–190 (2012)
Fenton, N.E., Ohlsson, N.: Quantitative analysis of faults and failures in a complex software system. IEEE Trans. Softw. Eng. 26, 797–814 (2000)
Frakes, W.B., Succi, G.: An industrial study of reuse, quality, and productivity. J. Syst. Softw. 57, 99–106 (2001)
Gomaa, H.: Designing Software Product Lines with UML: From Use Cases to Pattern-Based Software Architectures. Addison Wesley Longman Publishing Co. Inc, Redwood City, CA (2004)
van Gurp, J., Prehofer, C., Bosch, J.: Comparing practices for reuse in integration-oriented software product lines and large open source software projects. Softw. Prac. Exper. 40(4), 285–312 (2010)
Hall, T., Beecham, S., Bowes, D., Gray, D., Counsell, S.: A systematic review of fault prediction performance in software engineering. IEEE Trans. Softw. Eng. 38(6), 1276–1304 (2012)
Hamill, M., Goseva-Popstojanova, K.: Common trends in software fault and failure data. IEEE Trans. Softw. Eng. 35, 484–496 (2009)
He, Z., Shu, F., Yang, Y., Li, M., Wang, Q.: An investigation on the feasibility of cross-project defect prediction. Autom. Softw. Eng. 19(2), 167–199 (2012)
He, Z., Peters, F., Menzies, T., Yang, Y.: Learning from open-source projects: An empirical study on defect prediction. In: Proceedings of the ACM / IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM’13, pp. 45–54 (2013)
Kamei, Y., Matsumoto, S., Monden, A., Matsumoto, Ki., Adams, B., Hassan, A.E.: Revisiting common bug prediction findings using effort-aware models. In: Proceedings of the 2010 IEEE International Conference on Software Maintenance, ICSM ’10, pp. 1–10 (2010)
Kastro, Y., Bener, A.B.: A defect prediction method for software versioning. Softw. Qual. Control 16(4), 543–562 (2008)
Khoshgoftaar, T., Munson, J.: Predicting software development errors using software complexity metrics. IEEE J. Sel. Areas Commun. 8(2), 253–261 (1990)
Khoshgoftaar, T.M., Seliya, N.: Comparative assessment of software quality classification techniques: an empirical case study. Empir. Softw. Eng. 9(3), 229–257 (2004)
Kitchenham, B., Mendes, E.: Why comparative effort prediction studies may be invalid. In: Proceedings of the 5th International Conference on Predictor Models in Software Engineering, PROMISE ’09, pp. 4:1–4:5 (2009)
Kleinbaum, D.G., Kupper, L.L., Muller, K.E. (eds.): Applied regression analysis and other multivariable methods. PWS Publishing Co., Boston, MA (1988)
Krishnan, S., Lutz, R.R., Goseva-Popstojanova, K.: Empirical evaluation of reliability improvement in an evolving software product line. In: Proceedings of the 8th Working Conference on Mining Software Repositories, MSR ’11, pp. 103–112 (2011a)
Krishnan, S., Strasburg, C., Lutz, R.R., Goseva-Popstojanova, K.: Are change metrics good predictors for an evolving software product line? In: Proceedings of the 7th International Conference on Predictive Models in Software Engineering, PROMISE’11, pp. 7:1–7:10 (2011b)
Krishnan, S., Strasburg, C., Lutz, R.R., Goseva-Popstojanova, K., Dorman, K.S.: Predicting failure-proneness in an evolving software product line. Info. Softw. Technol. 55(8), 1479–1495 (2012)
Kutner, M.H., Nachtsheim, C.J., Neter, J.: Applied Linear Regression Models, 4th edn. McGraw-Hill/Irwin, New York, NY (2004)
Laffra, C., Veys, N.: Where did Eclipse come from? http://wiki.eclipse.org/FAQ_Where_did_Eclipse_come_from%3F (2013). Accessed 5 Aug 2014
Li, P.L., Herbsleb, J., Shaw, M., Robinson, B.: Experiences and results from initiating field defect prediction and product test prioritization efforts at ABB Inc. In: Proceedings of the 28th International Conference on Software Engineering, ICSE ’06, pp. 413–422 (2006)
Lim, W.: Effects of reuse on quality, productivity, and economics. IEEE Softw. 11(5), 23–30 (1994)
Ma, Y., Luo, G., Zeng, X., Chen, A.: Transfer learning for cross-company software defect prediction. Info. Softw. Technol. 54(3), 248–256 (2012)
Mansfield, D.: CVSps-patchsets for CVS. http://www.cobite.com/cvsps (2012). Accessed 5 Aug 2014
McConnell, S.: Code Complete, 2nd edn. Microsoft Press, Redmond, WA (2004)
McCullagh, P., Nelder, J.: Generalized Linear Models. Monographs on Statistics and Applied Probability. Chapman and Hall, New York, NY (1983)
Mohagheghi, P., Conradi, R.: An empirical investigation of software reuse benefits in a large telecom product. ACM Trans. Softw. Eng. Method. 17, 13:1–13:31 (2008)
Mohagheghi, P., Conradi, R., Killi, O., Schwarz, H.: An empirical study of software reuse vs. defect-density and stability. In: Proceedings of the 26th International Conference on Software Engineering, ICSE ’04, pp. 282–291 (2004)
Moser, R., Pedrycz, W., Succi, G.: A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. In: Proceedings of the ACM/IEEE 30th International Conference on Software Engineering, ICSE ’08, pp. 181–190 (2008)
Nagappan, N., Ball, T., Zeller, A.: Mining metrics to predict component failures. In: Proceedings of the 28th International Conference on Software Engineering, ICSE ’06, pp. 452–461 (2006)
Nam, J., Pan, S.J., Kim, S.: Transfer defect learning. In: Proceedings of the 2013 International Conference on Software Engineering, ICSE ’13, pp. 382–391 (2013)
Nelder, J.A., Wedderburn, R.W.M.: Generalized linear models. J. Royal Statist. Soc. Ser. A (General) 135(3), 370–384 (1972)
Norušis, M.J.: IBM SPSS Statistics 19 Advanced Statistical Procedures Companion. Prentice Hall, Upper Saddle River, NJ (2012)
Ohlsson, N., Alberg, H.: Predicting fault-prone software modules in telephone switches. IEEE Trans. Softw. Eng. 22(12), 886–894 (1996)
Ostrand, T.J., Weyuker, E.J.: The distribution of faults in a large industrial software system. In: Proceedings of the 2002 ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA ’02, pp. 55–64 (2002)
Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Where the bugs are. In: Proceedings of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA’04, pp. 86–96 (2004)
Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Predicting the location and number of faults in large software systems. IEEE Trans. Softw. Eng. 31(4), 340–355 (2005)
Ostrand, T.J., Weyuker, E.J., Bell, R.M.: Programmer-based fault prediction. In: Proceedings of the 6th International Conference on Predictive Models in Software Engineering, PROMISE’10, pp. 19:1–19:10 (2010)
Pohl, K., Böckle, G.: Software Product Line Engineering: Foundations, Principles and Techniques. Springer-Verlag, Secaucus, NJ (2005)
Selby, R.: Enabling reuse-based software development of large-scale systems. IEEE Trans. Softw. Eng. 31(6), 495–510 (2005)
Shin, Y., Bell, R., Ostrand, T., Weyuker, E.: Does calling structure information improve the accuracy of fault prediction? In: Proceedings of the 6th IEEE International Working Conference on Mining Software Repositories, MSR ’09, pp. 61–70 (2009)
Shull, F.J., Carver, J.C., Vegas, S., Juristo, N.: The role of replications in empirical software engineering. Empir. Softw. Eng. 13(2), 211–218 (2008)
SourceMonitor (2011) Version 3.2. http://www.campwoodsw.com/sourcemonitor.html. Accessed 5 Aug 2014
Taylor, R.N.: The role of architectural styles in successful software ecosystems. In: Proceedings of the 17th International Software Product Line Conference, SPLC ’13, pp. 2–4 (2013)
Thomas, W.M., Delis, A., Basili, V.R.: An analysis of errors in a reuse-oriented development environment. J. Syst. Softw. 38, 211–224 (1997)
Thompson, J.M., Heimdahl, M.P.E.: Structuring product family requirements for n-dimensional and hierarchical product lines. Requir. Eng. 8(1), 42–54 (2003)
Turhan, B., Menzies, T., Bener, A.B., Di Stefano, J.: On the relative value of cross-company and within-company data for defect prediction. Empir. Softw. Eng. 14(5), 540–578 (2009)
van der Linden, F.: Applying open source software principles in product lines. CEPIS Upgrade Eur. J. Info. Prof. 10, 32–40 (2009)
van der Linden, F.: Open source practices in software product line engineering. In: Lucia, A., Ferrucci, F. (eds.) Software Engineering, Lecture Notes in Computer Science, vol. 7171, pp. 216–235. Springer, Berlin Heidelberg (2013)
Watanabe, S., Kaiya, H., Kaijiri, K.: Adapting a fault prediction model to allow inter language reuse. In: Proceedings of the 4th International Workshop on Predictor Models in Software Engineering, PROMISE ’08, pp. 19–24 (2008)
Weiss, D.M., Lai, C.T.R.: Software Product-Line Engineering: A Family-Based Software Development Process. Addison-Wesley Longman Publishing Co. Inc, Boston, MA (1999)
Weyuker, E.J., Ostrand, T.J., Bell, R.M.: Do too many cooks spoil the broth? Using the number of developers to enhance defect prediction models. Empir. Softw. Eng. 13(5), 539–559 (2008)
Zhang, W., Jarzabek, S.: Reuse without compromising performance: Industrial experience from RPG software product line for mobile devices. In: Software Product Lines, LNCS, vol. 3714, pp. 57–69 (2005)
Zimmermann, T., Premraj, R., Zeller, A.: Predicting defects for Eclipse. In: Proceedings of the 3rd International Workshop on Predictor Models in Software Engineering, PROMISE’07, p. 9 (2007)
Zimmermann, T., Nagappan, N., Gall, H., Giger, E., Murphy, B.: Cross-project defect prediction: a large scale experiment on data vs. domain vs. process. In: Proceedings of the 7th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE ’09, pp. 91–100 (2009)
Acknowledgments
This work was supported in part by the National Science Foundation Grants 0916275 and 0916284 with funds from the American Recovery and Reinvestment Act of 2009 and by the WVU ADVANCE Sponsorship Program funded by the National Science Foundation ADVANCE IT Program award HRD-100797. Part of this work was performed while Robyn Lutz was visiting the California Institute of Technology.
Appendix: Aggregation metrics
The static code and change metrics were collected at the file level and then aggregated to the package level, as specified in Tables 7 and 8. As a result, each package was characterized by a vector \(\mathbf{m}\) of 112 metrics (i.e., features), where \(\mathbf{m}\left[ i \right] , i=1,\ldots ,73\) are static code metrics and \(\mathbf{m}\left[ i \right] , i=74,\ldots ,112\) are change metrics.
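The general idea of aggregating file-level metrics into a package-level feature vector can be sketched as follows. The metric names and the choice of aggregation functions (sum, maximum, mean) are hypothetical placeholders; the aggregations actually used are those specified in Tables 7 and 8, which are not reproduced in this sketch.

```python
from statistics import mean

# Hypothetical aggregation functions; the study's real choices are in Tables 7 and 8.
AGGREGATORS = {"sum": sum, "max": max, "mean": mean}


def aggregate_package(file_metrics, plan):
    """Build one package-level vector from per-file metric dictionaries.

    plan is a list of (metric_name, aggregator_name) pairs; each pair
    contributes one entry to the resulting vector.
    """
    return [
        AGGREGATORS[agg]([f[name] for f in file_metrics])
        for name, agg in plan
    ]


# Hypothetical per-file metrics for one package with two files.
files = [
    {"loc": 120, "complexity": 7, "revisions": 3},
    {"loc": 80, "complexity": 12, "revisions": 5},
]
plan = [
    ("loc", "sum"),
    ("complexity", "max"),
    ("complexity", "mean"),
    ("revisions", "sum"),
]
vector = aggregate_package(files, plan)
```

Because a single file-level metric can be aggregated in more than one way, the package-level vector can contain more entries than there are file-level metrics.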
Cite this article
Devine, T., Goseva-Popstojanova, K., Krishnan, S. et al. Assessment and cross-product prediction of software product line quality: accounting for reuse across products, over multiple releases. Autom Softw Eng 23, 253–302 (2016). https://doi.org/10.1007/s10515-014-0160-4
DOI: https://doi.org/10.1007/s10515-014-0160-4