research-article
DOI:10.1145/3416508.3417116

Identifying key developers using artifact traceability graphs

Published: 08 November 2020

Abstract

Developers are the most important resource for building and maintaining software projects. For various reasons, some developers take on more responsibility, and these developers are more valuable and indispensable to the project. Without them, the success of the project would be at risk. We use the term key developers for these essential and valuable developers; identifying them is crucial for managerial decisions such as assessing the risk of potential developer resignations. We study key developers under three categories: jacks, mavens and connectors. A typical jack (of all trades) has broad knowledge of the project and is familiar with different parts of the source code, whereas mavens are the sole experts in specific parts of the project. Connectors are the developers who link different groups of developers or teams, acting as bridges between them.
To identify key developers in a software project, we propose using traceable links among software artifacts, such as the links between change sets and files. First, we build an artifact traceability graph; then we define various metrics to find key developers. We conduct experiments on three open source projects: Hadoop, Hive and Pig. To validate our approach, we use developer comments in issue tracking systems and demonstrate that the key developers identified by our approach match the top commenters by up to 92%.
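
The abstract does not spell out the graph construction or the metrics, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a bipartite developer-to-file view derived from change sets, and uses three simple stand-in scores: file coverage for jacks, solely-owned files for mavens, and a count of otherwise unconnected collaborator pairs for connectors. The sample data, all function names, and the top-k overlap check are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical change sets: (change set id, author, files touched).
CHANGE_SETS = [
    ("c1", "alice", ["core/a.py", "core/b.py"]),
    ("c2", "alice", ["io/c.py", "ui/d.py"]),
    ("c3", "bob", ["core/a.py"]),
    ("c4", "carol", ["ui/d.py", "ui/e.py"]),
    ("c5", "dave", ["db/f.py"]),
]


def build_graph(change_sets):
    """Return developer -> files and file -> developers maps (bipartite view)."""
    files_of = defaultdict(set)
    devs_of = defaultdict(set)
    for _change_id, author, files in change_sets:
        for path in files:
            files_of[author].add(path)
            devs_of[path].add(author)
    return files_of, devs_of


def jack_scores(files_of, devs_of):
    """Assumed proxy for jacks: fraction of all files a developer has touched."""
    total_files = len(devs_of)
    return {dev: len(files) / total_files for dev, files in files_of.items()}


def maven_scores(files_of, devs_of):
    """Assumed proxy for mavens: files whose only contributor is this developer."""
    return {
        dev: sum(1 for path in files if devs_of[path] == {dev})
        for dev, files in files_of.items()
    }


def connector_scores(files_of, devs_of):
    """Assumed proxy for connectors: pairs of collaborators who share no file
    with each other and are therefore linked only through this developer."""
    neighbours = defaultdict(set)
    for devs in devs_of.values():
        for a, b in combinations(sorted(devs), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    scores = {}
    for dev in files_of:
        pairs = combinations(sorted(neighbours[dev]), 2)
        scores[dev] = sum(1 for a, b in pairs if b not in neighbours[a])
    return scores


def top_k_overlap(predicted, reference, k):
    """Share of the top-k predicted developers that also appear in the top-k
    of a reference ranking (for example, a ranking by issue comments)."""
    return len(set(predicted[:k]) & set(reference[:k])) / k


if __name__ == "__main__":
    files_of, devs_of = build_graph(CHANGE_SETS)
    print("jacks:", jack_scores(files_of, devs_of))
    print("mavens:", maven_scores(files_of, devs_of))
    print("connectors:", connector_scores(files_of, devs_of))
```

In this toy data, alice touches the most files and is the only link between bob and carol, so she ranks first on both the jack and connector proxies. A real analysis would likely replace the connector proxy with a standard centrality measure computed over the full graph.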

Information & Contributors

Information

Published In

PROMISE 2020: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering
November 2020
80 pages
ISBN:9781450381277
DOI:10.1145/3416508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. artifact traceability graphs
  2. connector
  3. developer categories
  4. jack
  5. key developers
  6. maven
  7. most valuable developers
  8. social networks

Qualifiers

  • Research-article

Conference

PROMISE '20

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 26
  • Downloads (last 6 weeks): 8
Reflects downloads up to 15 Jan 2025.

Citations

Cited By

  • (2024) Balanced knowledge distribution among software development teams—Observations from open‐ and closed‐source software development. Journal of Software: Evolution and Process. DOI:10.1002/smr.2655. Online publication date: 13-Feb-2024
  • (2022) Metrics to quantify software developer experience. Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, 1562-1569. DOI:10.1145/3477314.3507304. Online publication date: 25-Apr-2022
  • (2022) When traceability goes awry. Journal of Systems and Software 192:C. DOI:10.1016/j.jss.2022.111389. Online publication date: 1-Oct-2022
  • (2022) Analyzing developer contributions using artifact traceability graphs. Empirical Software Engineering 27:3. DOI:10.1007/s10664-022-10129-2. Online publication date: 1-May-2022
