Abstract
Software development is a knowledge-intensive industry. For this reason, concentration of knowledge in software projects tends to be very risky, which increases the relevance of strategies that reveal how source code knowledge is distributed among team members. The truck factor (also known as the bus factor) is an increasingly popular concept, proposed by practitioners, that indicates the minimum number of developers who have to be hit by a truck (or leave the team) before a project is incapacitated. It is therefore a measure that reveals the concentration of knowledge and the key developers in a project. Due to the importance of this concept, algorithms have been proposed to automatically compute truck factors, using maintenance activity data extracted from version control systems. However, we still lack large studies that assess the results of truck factor algorithms. To fill this gap in the literature, this paper describes the results of three empirical studies. In the first study, we validate the results produced by three algorithms to estimate truck factors. For this purpose, we build an oracle of truck factors, gathered via a survey with 35 open-source project teams. In the second study, we provide a comparison between truck factors and core developers, a related concept commonly used to denote the key developers of open-source projects. Our results indicate that truck factor developers are in most cases a subset of core developers. Finally, as the algorithms proposed so far are based on commit data, in the third study, we investigate other factors that may impact the computation of truck factors.
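To make the truck factor definition concrete, the following is a minimal sketch of one possible greedy estimate. It is not one of the three algorithms evaluated in the paper. It assumes a hypothetical input mapping each file to the set of developers who know it, and it assumes that a project is "incapacitated" once more than half of its files have no remaining knowledgeable developer. The function name, the input format, and the 50% threshold are all illustrative assumptions.

```python
from collections import Counter


def truck_factor(file_authors, threshold=0.5):
    """Greedy truck factor estimate (illustrative sketch only).

    file_authors: dict mapping a file path to the set of developers
                  assumed to be knowledgeable about it (hypothetical input).
    threshold:    assumed fraction of orphaned files above which the
                  project counts as incapacitated.
    Returns (truck_factor, removed_developers_in_order).
    """
    remaining = {path: set(devs) for path, devs in file_authors.items()}
    total = len(remaining)
    removed = []
    if total == 0:
        return 0, removed

    while True:
        # A file is orphaned once nobody who knows it is left.
        orphaned = sum(1 for devs in remaining.values() if not devs)
        if orphaned / total > threshold:
            return len(removed), removed

        # Count how many files each remaining developer knows.
        counts = Counter(dev for devs in remaining.values() for dev in devs)
        if not counts:
            # Nobody left to remove (only reachable if threshold >= 1).
            return len(removed), removed

        # Remove the developer covering the most files; ties are broken
        # alphabetically so the result is deterministic.
        top = min(counts, key=lambda dev: (-counts[dev], dev))
        removed.append(top)
        for devs in remaining.values():
            devs.discard(top)


# Hypothetical project with four files and three developers.
example = {
    "core/parser.py": {"alice"},
    "core/lexer.py": {"alice"},
    "core/ast.py": {"alice", "bob"},
    "docs/build.py": {"carol"},
}
tf, key_developers = truck_factor(example)
print(tf, key_developers)
```

In this hypothetical project, removing alice orphans exactly half of the files, which does not yet exceed the assumed threshold, so a second developer must also leave before the project counts as incapacitated. A greedy removal order is only a heuristic: it gives an upper estimate of the minimum, not a guaranteed optimum.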
Notes
Findbugs is a popular open-source bug-finding tool for Java systems: https://mailman.cs.umd.edu/pipermail/findbugs-discuss/2016-November/004321.html
We also ran experiments for 100,000 samples. After increasing the number of analyzed samples, the results presented by the RIG algorithm did not show significant improvements: 13 systems had error = 0, and 21 systems had |error| ≤ 1.
Some answers mentioned more than one role, so the sum exceeds 100%.
Funding
This research is supported by grants from FAPEMIG, CAPES, and CNPq.
Cite this article
Ferreira, M., Mombach, T., Valente, M.T. et al. Algorithms for estimating truck factors: a comparative study. Software Qual J 27, 1583–1617 (2019). https://doi.org/10.1007/s11219-019-09457-2
DOI: https://doi.org/10.1007/s11219-019-09457-2