review-article

A systematic process for Mining Software Repositories: Results from a systematic literature review

Published: 01 April 2022

Abstract

Context:

Mining Software Repositories (MSR) is a growing area of Software Engineering (SE) research. Since MSR studies emerged in 2004, many investigations have analysed different aspects of them. However, there are no guidelines on how to conduct MSR studies systematically. There is a need to evaluate how MSR research is approached in order to provide a framework for doing so systematically.

Objective:

To identify how MSR studies are conducted in terms of repository selection and data extraction. To uncover potential for improvement in directing systematic research and providing guidelines to do so.

Method:

A systematic literature review of MSR studies was conducted following the guidelines and template proposed by Mian et al. (which refine those provided by Kitchenham and Charters). These guidelines were extended and revised to provide a framework for systematic MSR studies.

Results:

MSR studies typically do not follow a systematic approach for repository selection, and many do not report selection or data extraction protocols. Furthermore, few manuscripts discuss threats to the study’s validity due to the selection or data extraction steps followed.

Conclusions:

Although MSR studies are evidence-based research, they seldom follow a systematic process. Hence, there is a need for guidelines on how to conduct systematic MSR studies. New guidelines and a template have been proposed, consolidating related studies in the MSR field and strategies for systematic literature reviews.

Highlights

Systematic literature review examining the process of MSR studies.
Provides a systematic approach for MSR research with a protocol template.
Proposes guidelines derived from existing processes, consolidating prior works.

References

[1]
Trautsch F., Herbold S., Makedonski P., Grabowski J., Addressing problems with replicability and validity of repository mining studies through a smart data platform, Empir. Softw. Eng. 23 (2018) 1036–1083.
[2]
Hassan A.E., The road ahead for Mining Software repositories, in: 2008 Frontiers of Software Maintenance, 2008, pp. 48–57.
[3]
Felderer M., Jeschko F., A process for evidence-based engineering of domain-specific languages, in: Proceedings of the 22nd International Conference on Evaluation and Assessment in Software Engineering 2018, in: EASE’18, Association for Computing Machinery, New York, NY, USA, 2018, pp. 169–174.
[4]
Kamei Y., Zaidman A., Guest editorial: Mining software repositories 2018, Empir. Softw. Eng. (2018) 1–3.
[5]
Dong L., Liu B., Li Z., Wu O., Babar M.A., Xue B., A mapping study on mining software process, in: 2017 24th Asia-Pacific Software Engineering Conference (APSEC), IEEE, Nanjing, 2017, pp. 51–60. URL: http://ieeexplore.ieee.org/document/8305927/.
[6]
Kalliamvakou E., Gousios G., Blincoe K., Singer L., German D.M., Damian D., An in-depth study of the promises and perils of mining GitHub, Empir. Softw. Eng. 21 (2016) 2035–2071.
[7]
Kotti Z., Spinellis D., Standing on shoulders or feet? the usage of the MSR data papers, in: Proceedings of the 16th International Conference on Mining Software Repositories, in: MSR ’19, IEEE Press, Montreal, Quebec, Canada, 2019, pp. 565–576.
[8]
Kitchenham B., Brereton P., A systematic review of systematic review process research in software engineering, Inf. Softw. Technol. 55 (2013) 2049–2075. URL: http://www.sciencedirect.com/science/article/pii/S0950584913001560.
[9]
Mian P.G., Conte T., Natali A.C.C., de Almeida Biolchini J.C., Travassos G.H., A systematic review process for software engineering, in: Experimental Software Engineering Track (ESELAW), 2005, pp. 1–6. URL: https://api.semanticscholar.org/CorpusID:17417820.
[10]
Dybå T., Bergersen G., Sjøberg D., Evidence-based software engineering, in: Menzies T., Williams L., Zimmermann T. (Eds.), Perspectives on Data Science for Software Engineering, Morgan Kaufmann, Boston, 2016, pp. 149–153. URL: http://www.sciencedirect.com/science/article/pii/B9780128042069000295.
[11]
Kitchenham B.A., Dyba T., Jorgensen M., Evidence-based software engineering, in: Proceedings of the 26th International Conference on Software Engineering, in: ICSE ’04, IEEE Computer Society, USA, 2004, pp. 273–281.
[12]
K. Petersen, N.B. Ali, Identifying strategies for study selection in systematic reviews and maps, in: 2011 International Symposium on Empirical Software Engineering and Measurement, 2011, pp. 351–354, https://doi.org/10.1109/ESEM.2011.46.
[13]
Petersen K., Feldt R., Mujtaba S., Mattsson M., Systematic mapping studies in software engineering, in: Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering, in: EASE’08, BCS Learning & Development Ltd., Swindon, GBR, 2008, pp. 68–77.
[14]
Brereton P., Kitchenham B.A., Budgen D., Turner M., Khalil M., Lessons from applying the systematic literature review process within the software engineering domain, J. Syst. Softw. 80 (2007) 571–583. URL: http://www.sciencedirect.com/science/article/pii/S016412120600197X, Software Performance.
[15]
Farias M.A., Novais R., Júnior M.C., da Silva Carvalho L.P., Mendonça M., Spínola R.O., A systematic mapping study on mining software repositories, in: Proceedings of the 31st Annual ACM Symposium on Applied Computing, in: SAC ’16, Association for Computing Machinery, Pisa, Italy, 2016, pp. 1472–1479.
[16]
Güemes-Peña D., López-Nozal C., Marticorena-Sánchez R., Maudes-Raedo J., Emerging topics in mining software repositories, Progress in Artifi. Intell. 7 (2018) 237–247.
[17]
K. Chaturvedi, V. Sing, P. Singh, Tools in mining software repositories, in: 2013 13th International Conference on Computational Science and Its Applications, 2013, pp. 89–98, https://doi.org/10.1109/ICCSA.2013.22.
[18]
A. Tripathi, S. Dabral, A. Sureka, University-industry collaboration and open source software (OSS) dataset in mining software repositories (MSR) research, in: 2015 IEEE 1st International Workshop on Software Analytics (SWAN), 2015, pp. 39–40, https://doi.org/10.1109/SWAN.2015.7070489.
[19]
Hassan A.E., The road ahead for Mining Software repositories, in: 2008 Frontiers of Software Maintenance, 2008, pp. 48–57.
[20]
Vial G., Reflections on quality requirements for digital trace data in IS research, Decis. Support Syst. 126 (2019). URL: http://www.sciencedirect.com/science/article/pii/S0167923619301629.
[21]
Kitchenham B., Procedures for performing systematic reviews, Keele, UK, Keele University 33 (2004) 1–26.
[22]
Petersen K., Vakkalanka S., Kuzniarz L., Guidelines for conducting systematic mapping studies in software engineering: An update, Inf. Softw. Technol. 64 (2015) 1–18. URL: http://www.sciencedirect.com/science/article/pii/S0950584915000646.
[23]
Shang W., Adams B., Hassan A.E., Using Pig as a data preparation language for large-scale mining software repositories studies: An experience report, J. Syst. Softw. 85 (2012) 2195–2204. URL: http://www.sciencedirect.com/science/article/pii/S0164121211002007.
[24]
M. D’Ambros, R. Robbes, Effective mining of software repositories, in: 2011 27th IEEE International Conference on Software Maintenance (ICSM), 2011, pp. 598–598, https://doi.org/10.1109/ICSM.2011.6080839, ISSN: 1063-6773.
[25]
Garcia I., Pacheco C., Méndez F., Calvo-Manzano J.A., The effects of game-based learning in the acquisition of “soft skills” on undergraduate software engineering courses: A systematic literature review, Comput. Appl. Eng. Edu. 28 (2020) 1327–1354.
[26]
Abuhamad M., Rhim J.-s., AbuHmed T., Ullah S., Kang S., Nyang D., Code authorship identification using convolutional neural networks, Future Gener. Comput. Syst. 95 (2019) 104–115. URL: http://www.sciencedirect.com/science/article/pii/S0167739X18315528.
[27]
M.H. Asyrofi, F. Thung, D. Lo, L. Jiang, AUSearch: Accurate API usage search in GitHub repositories with type resolution, in: 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), 2020, pp. 637–641, https://doi.org/10.1109/SANER48275.2020.9054809, ISSN: 1534-5351.
[28]
Bakar N.S.A.A., Using language-based search in mining large software repositories, Procedia - Soc. Behav. Sci. 27 (2011) 160–168. URL: http://www.sciencedirect.com/science/article/pii/S1877042811024219.
[29]
Banerjee S., Syed Z., Helmick J., Culp M., Ryan K., Cukic B., Automated triaging of very large bug repositories, Inf. Softw. Technol. 89 (2017) 1–13. URL: http://www.sciencedirect.com/science/article/pii/S0950584916301653.
[30]
Batista N.A., Brandão M.A., Alves G.B., da Silva A.P.C., Moro M.M., Collaboration strength metrics and analyses on GitHub, in: Proceedings of the International Conference on Web Intelligence, in: WI ’17, Association for Computing Machinery, Leipzig, Germany, 2017, pp. 170–178.
[31]
Capiluppi A., Ajienka N., Lexical content as a cooperation aide: A study based on Java software, J. Syst. Softw. 164 (2020). URL: http://www.sciencedirect.com/science/article/pii/S016412122030025X.
[32]
Chong C.Y., Lee S.P., Can commit change history reveal potential fault prone classes? A study on GitHub repositories, in: van Sinderen M., Maciaszek L.A. (Eds.), Software Technologies, in: Communications in Computer and Information Science, Springer International Publishing, Cham, 2019, pp. 266–281.
[33]
Coelho J., Valente M.T., Silva L.L., Shihab E., Identifying unmaintained projects in github, in: Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, in: ESEM ’18, Association for Computing Machinery, Oulu, Finland, 2018, pp. 1–10.
[34]
E. Cohen, M.P. Consens, Large-scale analysis of the co-commit patterns of the active developers in GitHub’s top repositories, in: 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 2018, pp. 426–436, ISSN: 2574-3864.
[35]
Decan A., Constantinou E., Mens T., Rocha H., GAP: Forecasting commit activity in git projects, J. Syst. Softw. 165 (2020). URL: http://www.sciencedirect.com/science/article/pii/S0164121220300546.
[36]
A. Decan, T. Mens, M. Claes, P. Grosjean, When GitHub meets CRAN: An analysis of inter-repository package dependency problems, in: 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), vol. 1, 2016, pp. 493–504, https://doi.org/10.1109/SANER.2016.12.
[37]
El Mezouar M., Zhang F., Zou Y., An empirical study on the teams structures in social coding using GitHub projects, Empir. Softw. Eng. 24 (2019) 3790–3823.
[38]
G. Farah, D. Correal, Analysis of intercrossed open-source software repositories data in GitHub, in: 2013 8th Computing Colombian Conference (8CCC), 2013, pp. 1–6, https://doi.org/10.1109/ColombianCC.2013.6637537.
[39]
Gelman B., Obayomi B., Moore J., Slater D., Source code analysis dataset, Data in Brief 27 (2019). URL: http://www.sciencedirect.com/science/article/pii/S2352340919310674.
[40]
Gupta M., Nirikshan: process mining software repositories to identify inefficiencies, imperfections, and enhance existing process capabilities, in: Companion Proceedings of the 36th International Conference on Software Engineering, in: ICSE Companion 2014, Association for Computing Machinery, Hyderabad, India, 2014, pp. 658–661.
[41]
F. Hassan, X. Wang, Mining readme files to support automatic building of java projects in software repositories, in: 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), 2017, pp. 277–279, https://doi.org/10.1109/ICSE-C.2017.114.
[42]
Higo Y., Hayashi S., Kusumoto S., On tracking Java methods with Git mechanisms, J. Syst. Softw. 165 (2020). URL: http://www.sciencedirect.com/science/article/pii/S0164121220300522.
[43]
Härtel J., Heinz M., Lämmel R., EMF patterns of usage on GitHub, in: Pierantonio A., Trujillo S. (Eds.), Modelling Foundations and Applications, in: Lecture Notes in Computer Science, Springer International Publishing, Cham, 2018, pp. 216–234.
[44]
S.D. Joshi, S. Chimalakonda, RapidRelease - A dataset of projects and issues on Github with rapid releases, in: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019, pp. 587–591, https://doi.org/10.1109/MSR.2019.00088, ISSN: 2574-3864.
[45]
Kawaguchi S., Garg P.K., Matsushita M., Inoue K., MUDABlue: An automatic categorization system for Open Source repositories, J. Syst. Softw. 79 (2006) 939–953. URL: http://www.sciencedirect.com/science/article/pii/S0164121205001822.
[46]
I. Keivanloo, C. Forbes, A. Hmood, M. Erfani, C. Neal, G. Peristerakis, J. Rilling, A Linked Data platform for mining software repositories, in: 2012 9th IEEE Working Conference on Mining Software Repositories (MSR), 2012, pp. 32–35, https://doi.org/10.1109/MSR.2012.6224296, ISSN: 2160-1860.
[47]
Kiehn M., Pan X., Camci F., Empirical study in using version histories for change risk classification, in: Proceedings of the 16th International Conference on Mining Software Repositories, in: MSR ’19, IEEE Press, Montreal, Quebec, Canada, 2019, pp. 58–62.
[48]
Kikas R., Dumas M., Pfahl D., Issue dynamics in Github projects, in: Abrahamsson P., Corral L., Oivo M., Russo B. (Eds.), Product-Focused Software Process Improvement, in: Lecture Notes in Computer Science, Springer International Publishing, Cham, 2015, pp. 295–310.
[49]
Maqsood J., Eshraghi I., Ali S.S., Success or failure identification for GitHub’s open source projects, in: Proceedings of the 2017 International Conference on Management Engineering, Software Engineering and Service Sciences, in: ICMSS ’17, Association for Computing Machinery, Wuhan, China, 2017, pp. 145–150.
[50]
Martinez M., Monperrus M., Mining software repair models for reasoning on the search space of automated program fixing, Empir. Softw. Eng. 20 (2015) 176–205.
[51]
Munaiah N., Kroh S., Cabrey C., Nagappan M., Curating GitHub for engineered software projects, Empir. Softw. Eng. 22 (2017) 3219–3253.
[52]
W. Muylaert, C. De Roover, Prevalence of botched code integrations, in: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), 2017, pp. 503–506, https://doi.org/10.1109/MSR.2017.40.
[53]
Nafi K.W., Roy B., Roy C.K., Schneider K.A., A universal cross language software similarity detector for open source software categorization, J. Syst. Softw. 162 (2020). URL: http://www.sciencedirect.com/science/article/pii/S0164121219302651.
[54]
P.T. Nguyen, J. Di Rocco, R. Rubei, D. Di Ruscio, CrossSim: Exploiting mutual relationships to detect similar OSS projects, in: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2018, pp. 388–395, https://doi.org/10.1109/SEAA.2018.00069.
[55]
Parashar A., Chhabra J.K., Mining software change data stream to predict changeability of classes of object-oriented software system, Evol. Syst. 7 (2016) 117–128.
[56]
Rahman M.M., Roy C.K., An insight into the pull requests of GitHub, in: Proceedings of the 11th Working Conference on Mining Software Repositories, in: MSR 2014, Association for Computing Machinery, Hyderabad, India, 2014, pp. 364–367.
[57]
Saied M.A., Ouni A., Sahraoui H., Kula R.G., Inoue K., Lo D., Improving reusability of software libraries through usage pattern mining, J. Syst. Softw. 145 (2018) 164–179. URL: http://www.sciencedirect.com/science/article/pii/S0164121218301699.
[58]
Santos A., Souza M., Oliveira J., Figueiredo E., Mining software repositories to identify library experts, in: Proceedings of the VII Brazilian Symposium on Software Components, Architectures, and Reuse, in: SBCARS ’18, Association for Computing Machinery, Sao Carlos, Brazil, 2018, pp. 83–91.
[59]
L.B.L. de Souza, M. de Almeida Maia, Do software categories impact coupling metrics? in: 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 217–220, https://doi.org/10.1109/MSR.2013.6624030, ISSN: 2160-1860.
[60]
de la Torre G., Robbes R., Bergel A., Imprecisions diagnostic in source code deltas, in: Proceedings of the 15th International Conference on Mining Software Repositories, in: MSR ’18, Association for Computing Machinery, Gothenburg, Sweden, 2018, pp. 492–502.
[61]
Vendome C., A large scale study of license usage on GitHub, in: Proceedings of the 37th International Conference on Software Engineering - Volume 2, in: ICSE ’15, IEEE Press, Florence, Italy, 2015, pp. 772–774.
[62]
M. White, C. Vendome, M. Linares-Vasquez, D. Poshyvanyk, Toward deep learning software repositories, in: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, 2015, pp. 334–345, https://doi.org/10.1109/MSR.2015.38, ISSN: 2160-1860.
[63]
Yu Y., Li Z., Yin G., Wang T., Wang H., A dataset of duplicate pull-requests in github, in: Proceedings of the 15th International Conference on Mining Software Repositories, in: MSR ’18, Association for Computing Machinery, Gothenburg, Sweden, 2018, pp. 22–25.
[64]
A. Zaidman, B. Van Rompaey, S. Demeyer, A. van Deursen, Mining software repositories to study co-evolution of production test code, in: 2008 1st International Conference on Software Testing, Verification, and Validation, 2008, pp. 220–229, https://doi.org/10.1109/ICST.2008.47, ISSN: 2159-4848.
[65]
Y. Zhang, D. Lo, P.S. Kochhar, X. Xia, Q. Li, J. Sun, Detecting similar repositories on GitHub, in: 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering (SANER), 2017, pp. 13–23, https://doi.org/10.1109/SANER.2017.7884605.
[66]
Zou W., Xuan J., Xie X., Chen Z., Xu B., How does code style inconsistency affect pull request integration? An exploratory study on 117 GitHub projects, Empir. Softw. Eng. 24 (2019) 3871–3903.
[67]
R. Bana, A. Arora, Influence indexing of developers, repositories, technologies and programming languages on social coding community GitHub, in: 2018 Eleventh International Conference on Contemporary Computing (IC3), 2018, pp. 1–6, https://doi.org/10.1109/IC3.2018.8530644, ISSN: 2572-6129.
[68]
H. Borges, A. Hora, M.T. Valente, Understanding the factors that impact the popularity of GitHub repositories, in: 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2016, pp. 334–344, https://doi.org/10.1109/ICSME.2016.31.
[69]
Borges H., Tulio Valente M., What’s in a GitHub star? Understanding repository starring practices in a social coding platform, J. Syst. Softw. 146 (2018) 112–129. URL: http://www.sciencedirect.com/science/article/pii/S0164121218301961.
[70]
Borle N.C., Feghhi M., Stroulia E., Greiner R., Hindle A., Analyzing the effects of test driven development in GitHub, Empir. Softw. Eng. 23 (2018) 1931–1958.
[71]
F. Chatziasimidis, I. Stamelos, Data collection and analysis of GitHub repositories and users, in: 2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA), 2015, pp. 1–6, https://doi.org/10.1109/IISA.2015.7388026.
[72]
Cito J., Schermann G., Wittern J.E., Leitner P., Zumberi S., Gall H.C., An empirical analysis of the docker container ecosystem on GitHub, in: Proceedings of the 14th International Conference on Mining Software Repositories, in: MSR ’17, IEEE Press, Buenos Aires, Argentina, 2017, pp. 323–333.
[73]
Goyal A., Sardana N., Performance assessment of bug fixing process in open source repositories, Procedia Comput. Sci. 167 (2020) 2070–2079. URL: http://www.sciencedirect.com/science/article/pii/S1877050920307134.
[74]
Guidotti R., Soldani J., Neri D., Brogi A., Explaining successful docker images using pattern mining analysis, in: Mazzara M., Ober I., Salaün G. (Eds.), Software Technologies: Applications and Foundations, in: Lecture Notes in Computer Science, Springer International Publishing, Cham, 2018, pp. 98–113.
[75]
N. Hajiakhoond Bidoki, G. Sukthankar, H. Keathley, I. Garibay, A cross-repository model for predicting popularity in GitHub, in: 2018 International Conference on Computational Science and Computational Intelligence (CSCI), 2018, pp. 1248–1253, https://doi.org/10.1109/CSCI46756.2018.00241.
[76]
Jiang J., Lo D., He J., Xia X., Kochhar P.S., Zhang L., Why and how developers fork what from whom in GitHub, Empir. Softw. Eng. 22 (2017) 547–578.
[77]
Kavaler D., Devanbu P., Filkov V., Whom are you going to call? determinants of @-mentions in Github discussions, Empir. Softw. Eng. 24 (2019) 3904–3932.
[78]
Kikas R., Dumas M., Pfahl D., Using dynamic and contextual features to predict issue lifetime in GitHub projects, in: Proceedings of the 13th International Conference on Mining Software Repositories, in: MSR ’16, Association for Computing Machinery, Austin, Texas, 2016, pp. 291–302.
[79]
Lee S., Baek H., Jahng J., Governance strategies for open collaboration: Focusing on resource allocation in open source software development organizations, Int. J. Inf. Manage. 37 (2017) 431–437. URL: http://www.sciencedirect.com/science/article/pii/S0268401216308623.
[80]
N. Li, Z. Li, L. Zhang, Mining frequent patterns from software defect repositories for black-box testing, in: 2010 2nd International Workshop on Intelligent Systems and Applications, 2010, pp. 1–4, https://doi.org/10.1109/IWISA.2010.5473578.
[81]
Ozer M., Sapienza A., Abeliuk A., Muric G., Ferrara E., Discovering patterns of online popularity from time series, Expert Syst. Appl. 151 (2020). URL: http://www.sciencedirect.com/science/article/pii/S0957417420301627.
[82]
Peng G., Co-membership, networks ties, and knowledge flow: An empirical investigation controlling for alternative mechanisms, Decis. Support Syst. 118 (2019) 83–90. URL: http://www.sciencedirect.com/science/article/pii/S0167923619300132.
[83]
Y. Zhang, F.F. Xu, S. Li, Y. Meng, X. Wang, Q. Li, J. Han, HiGitClass: Keyword-driven hierarchical classification of GitHub repositories, in: 2019 IEEE International Conference on Data Mining (ICDM), 2019, pp. 876–885, https://doi.org/10.1109/ICDM.2019.00098, ISSN: 2374-8486.
[84]
Chatzidimitriou K.C., Papamichail M.D., Diamantopoulos T., Tsapanos M., Symeonidis A.L., Npm-miner: an infrastructure for measuring the quality of the npm registry, in: Proceedings of the 15th International Conference on Mining Software Repositories, in: MSR ’18, Association for Computing Machinery, Gothenburg, Sweden, 2018, pp. 42–45.
[85]
Goeminne M., Mens T., A comparison of identity merge algorithms for software repositories, Sci. Comput. Programm. 78 (2013) 971–986. URL: http://www.sciencedirect.com/science/article/pii/S0167642311002048.
[86]
Nguyen P.T., Di Rocco J., Rubei R., Di Ruscio D., An automated approach to assess the similarity of GitHub repositories, Softw. Qual. J. 28 (2020) 595–631.
[87]
R. Souza, C. Chavez, Characterizing verification of bug fixes in two open source IDEs, in: 2012 9th IEEE Working Conference on Mining Software Repositories (MSR), 2012, pp. 70–73, https://doi.org/10.1109/MSR.2012.6224301, ISSN: 2160-1860.
[88]
N.M. Tiwari, G. Upadhyaya, H.A. Nguyen, H. Rajan, Candoia: A platform for building and sharing mining software repositories tools as apps, in: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), 2017, pp. 53–63, https://doi.org/10.1109/MSR.2017.56.
[89]
Zhou C., Li B., Sun X., Improving software bug-specific named entity recognition with deep neural network, J. Syst. Softw. 165 (2020). URL: http://www.sciencedirect.com/science/article/pii/S0164121220300534.
[90]
Fu Y., Yan M., Zhang X., Xu L., Yang D., Kymer J.D., Automated classification of software change messages by semi-supervised latent Dirichlet allocation, Inf. Softw. Technol. 57 (2015) 369–377. URL: http://www.sciencedirect.com/science/article/pii/S0950584914001347.
[91]
Mengerink J.G.M., Noten J., Serebrenik A., Empowering OCL research: a large-scale corpus of open-source data from GitHub, Empir. Softw. Eng. 24 (2019) 1574–1609.
[92]
J. Noten, J.G. Mengerink, A. Serebrenik, A data set of OCL expressions on GitHub, in: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), 2017, pp. 531–534, https://doi.org/10.1109/MSR.2017.52.
[93]
Robles G., Ho-Quang T., Hebig R., Chaudron M.R.V., Fernandez M.A., An extensive dataset of UML models in GitHub, in: Proceedings of the 14th International Conference on Mining Software Repositories, in: MSR ’17, IEEE Press, Buenos Aires, Argentina, 2017, pp. 519–522.
[94]
G. Schermann, S. Zumberi, J. Cito, Structured information on state and evolution of dockerfiles on GitHub, in: 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR), 2018, pp. 26–29, ISSN: 2574-3864.
[95]
Y. Yan, M. Menarini, W. Griswold, Mining software contracts for software evolution, in: 2014 IEEE International Conference on Software Maintenance and Evolution, 2014, pp. 471–475, https://doi.org/10.1109/ICSME.2014.76, ISSN: 1063-6773.
[96]
S. Brisson, E. Noei, K. Lyons, We are family: analyzing communication in GitHub software repositories and their forks, in: 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), 2020, pp. 59–69, https://doi.org/10.1109/SANER48275.2020.9054834, ISSN: 1534-5351.
[97]
H. Xia, C. Li, M. Shi, Design of repositories of GitHub recommendation system based on ternary closure and HITS algorithm, in: 2019 IEEE/ACIS 18th International Conference on Computer and Information Science (ICIS), 2019, pp. 1–5, https://doi.org/10.1109/ICIS46139.2019.8940236.
[98]
M. Goeminne, M. Claes, T. Mens, A historical dataset for the Gnome ecosystem, in: 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 225–228, https://doi.org/10.1109/MSR.2013.6624032, ISSN: 2160-1860.
[99]
Ohira M., Kashiwa Y., Yamatani Y., Yoshiyuki H., Maeda Y., Limsettho N., Fujino K., Hata H., Ihara A., Matsumoto K., A dataset of high impact bugs: manually-classified issue reports, in: Proceedings of the 12th Working Conference on Mining Software Repositories, in: MSR ’15, IEEE Press, Florence, Italy, 2015, pp. 518–521.
[100]
J.C.S. Santos, M. Mirakhorli, I. Mujhid, W. Zogaan, BUDGET: A tool for supporting software architecture traceability research, in: 2016 13th Working IEEE/IFIP Conference on Software Architecture (WICSA), 2016, pp. 303–306, https://doi.org/10.1109/WICSA.2016.47.
[101]
A. Trockman, R. van Tonder, B. Vasilescu, Striking gold in software repositories? An econometric study of cryptocurrencies on GitHub, in: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019, pp. 181–185, https://doi.org/10.1109/MSR.2019.00036, ISSN: 2574-3864.
[102]
Chen H., Huang Y., Liu Z., Chen X., Zhou F., Luo X., Automatically detecting the scopes of source code comments, J. Syst. Softw. 153 (2019) 45–63. URL: http://www.sciencedirect.com/science/article/pii/S016412121930055X.
[103]
Petticrew M., Roberts H., Systematic Reviews in the Social Sciences: A Practical Guide, John Wiley & Sons, 2008.
[104]
Ramachandran M., Software reuse guidelines, SIGSOFT Softw. Eng. Notes 30 (2005) 1–8.
[105]
Ramachandran M., Guidelines based software engineering for developing software components, J. Softw. Eng. Appl. 05 (2012) 1–6.
[106]
Kolovos D.S., Matragkas N.D., Korkontzelos I., Ananiadou S., Paige R.F., Assessing the use of eclipse MDE technologies in open-source software projects, in: OSS4MDE@MoDELS, 2015, pp. 1–10.
[107]
A. Howard, C. Zhang, E. Horvitz, Addressing bias in machine learning algorithms: A pilot study on emotion recognition for intelligent systems, in: 2017 IEEE Workshop on Advanced Robotics and Its Social Impacts (ARSO), 2017, pp. 1–7.
[108]
Kristiansen T.B., Erroneous data and drug industry bias can impair machine learning algorithms, BMJ 367 (2019). URL: https://www.bmj.com/content/367/bmj.l6042.
[109]
G. Gousios, D. Spinellis, Mining software engineering data from GitHub, in: 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), 2017, pp. 501–502.
[110]
E. Mendes, K. Felizardo, C. Wohlin, M. Kalinowski, Search strategy to update systematic literature reviews in software engineering, in: 2019 45th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2019, pp. 355–362, https://doi.org/10.1109/SEAA.2019.00061.
[111]
N.K. Nagwani, A. Bhansali, A data mining model to predict software bug complexity using bug estimation and clustering, in: 2010 International Conference on Recent Trends in Information, Telecommunication and Computing, 2010, pp. 13–17, https://doi.org/10.1109/ITC.2010.56.
[112]
van Tonder R., Trockman A., Goues C.L., A panel data set of cryptocurrency development activity on GitHub, in: Proceedings of the 16th International Conference on Mining Software Repositories, in: MSR ’19, IEEE Press, Montreal, Quebec, Canada, 2019, pp. 186–190.
[113]
E. Kouroshfar, M. Mirakhorli, H. Bagheri, L. Xiao, S. Malek, Y. Cai, A study on the role of software architecture in the evolution and quality of software, in: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, 2015, pp. 246–257, https://doi.org/10.1109/MSR.2015.30, ISSN: 2160-1860.
[114]
Neelofar, M.Y. Javed, H. Mohsin, An automated approach for software bug classification, in: 2012 Sixth International Conference on Complex, Intelligent, and Software Intensive Systems, 2012, pp. 414–419, https://doi.org/10.1109/CISIS.2012.132.
[115]
Raja U., Tretter M.J., Antecedents of open source software defects: A data mining approach to model formulation, validation and testing, Inform. Technol. Manag. 10 (2009) 235.
[116]
M. Harman, Y. Jia, Y. Zhang, App store mining and analysis: MSR for app stores, in: 2012 9th IEEE Working Conference on Mining Software Repositories (MSR), 2012, pp. 108–111, https://doi.org/10.1109/MSR.2012.6224306, ISSN: 2160-1860.
[117]
Prakash B.V.A., Ashoka D.V., Aradhya V.N.M., Application of data mining techniques for software reuse process, Proc. Technol. 4 (2012) 384–389. URL: http://www.sciencedirect.com/science/article/pii/S2212017312003386.
[118]
Costa F., de Oliveira D., Ogasawara E., Lima A.A.B., Mattoso M., Athena: Text mining based discovery of scientific workflows in disperse repositories, in: Lacroix Z., Vidal M.E. (Eds.), Resource Discovery, in: Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2012, pp. 104–121.
[119]
Ampatzoglou A., Michou O., Stamelos I., Building and mining a repository of design pattern instances: Practical and research benefits, Entertain. Comput. 4 (2013) 131–142. URL: http://www.sciencedirect.com/science/article/pii/S1875952112000195.
[120]
Arcelli Fontana F., Rolla M., Zanoni M., Capturing software evolution and change through code repository smells, in: Dingsøyr T., Moe N.B., Tonelli R., Counsell S., Gencel C., Petersen K. (Eds.), Agile Methods. Large-Scale Development, Refactoring, Testing, and Estimation, in: Lecture Notes in Business Information Processing, Springer International Publishing, Cham, 2014, pp. 148–165.
[121]
Prana G.A.A., Treude C., Thung F., Atapattu T., Lo D., Categorizing the content of GitHub README files, Empir. Softw. Eng. 24 (2019) 1296–1327.
[122]
Soll M., Vosgerau M., ClassifyHub: An algorithm to classify GitHub repositories, in: Kern-Isberner G., Fürnkranz J., Thimm M. (Eds.), KI 2017: Advances in Artificial Intelligence, in: Lecture Notes in Computer Science, Springer International Publishing, Cham, 2017, pp. 373–379.
[123]
Kim S., Whitehead E.J., Zhang Y., Classifying software changes: Clean or Buggy?, IEEE Trans. Softw. Eng. 34 (2008) 181–196.
[124]
Sicilia M.-A., García Barriocanal E., Sánchez-Alonso S., Community curation in open dataset repositories: Insights from Zenodo, Procedia Comput. Sci. 106 (2017) 54–60. URL: http://www.sciencedirect.com/science/article/pii/S1877050917302776.
[125]
L. Madeyski, M. Kawalerowicz, Continuous defect prediction: The idea and a related dataset, in: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), 2017, pp. 515–518, https://doi.org/10.1109/MSR.2017.46.
[126]
D. Kolovos, P. Neubauer, K. Barmpis, N. Matragkas, R. Paige, Crossflow: A framework for distributed mining of software repositories, in: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019, pp. 155–159, https://doi.org/10.1109/MSR.2019.00032, ISSN: 2574-3864.
[127]
M. Kumar J., S. Dubey, B. Balaji, D. Rao, D. Rao, Data visualization on GitHub repository parameters using elastic search and Kibana, in: 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI), 2018, pp. 554–558, https://doi.org/10.1109/ICOEI.2018.8553755.
[128]
Selby R., Enabling reuse-based software development of large-scale systems, IEEE Trans. Softw. Eng. 31 (2005) 495–510.
[129]
G. Canfora, L. Cerulo, Fine grained indexing of software repositories to support impact analysis, in: Proceedings of the 2006 International Workshop on Mining Software Repositories, (MSR ’06), Association for Computing Machinery, Shanghai, China, 2006, pp. 105–111, https://doi.org/10.1145/1137983.1138009.
[130]
Vasilescu B., Posnett D., Ray B., van den Brand M.G., Serebrenik A., Devanbu P., Filkov V., Gender and tenure diversity in GitHub teams, in: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, in: CHI ’15, Association for Computing Machinery, Seoul, Republic of Korea, 2015, pp. 3789–3798.
[131]
Lazar A., Ritchey S., Sharif B., Generating duplicate bug datasets, in: Proceedings of the 11th Working Conference on Mining Software Repositories, in: MSR 2014, Association for Computing Machinery, Hyderabad, India, 2014, pp. 392–395.
[132]
Lee R.K.-W., Lo D., GitHub and stack overflow: Analyzing developer interests across multiple social collaborative platforms, in: Ciampaglia G.L., Mashhadi A., Yasseri T. (Eds.), Social Informatics, in: Lecture Notes in Computer Science, Springer International Publishing, Cham, 2017, pp. 245–256.
[133]
X. Cai, J. Zhu, B. Shen, Y. Chen, GRETA: Graph-based tag assignment for GitHub repositories, in: 2016 IEEE 40th Annual Computer Software and Applications Conference (COMPSAC), vol. 1, 2016, pp. 63–72, https://doi.org/10.1109/COMPSAC.2016.124, ISSN: 0730-3157.
[134]
S.S. Manes, O. Baysal, How often and what StackOverflow posts do developers reference in their GitHub projects? in: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019, pp. 235–239, https://doi.org/10.1109/MSR.2019.00047, ISSN: 2574-3864.
[135]
C.A. Thompson, G.C. Murphy, M. Palyart, M. Gašparič, How software developers use work breakdown relationships in issue repositories, in: 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR), 2016, pp. 281–285.
[136]
F. Mulder, A. Zaidman, Identifying cross-cutting concerns using software repository mining, in: Proceedings of the Joint ERCIM Workshop on Software Evolution (EVOL) and International Workshop on Principles of Software Evolution (IWPSE), IWPSE-EVOL ’10, Association for Computing Machinery, Antwerp, Belgium, 2010, pp. 23–32, https://doi.org/10.1145/1862372.1862381.
[137]
Montandon J.E., Silva L.L., Valente M.T., Identifying experts in software libraries and frameworks among GitHub users, in: Proceedings of the 16th International Conference on Mining Software Repositories, in: MSR ’19, IEEE Press, Montreal, Quebec, Canada, 2019, pp. 276–287.
[138]
J. Hayashi, Y. Higo, S. Matsumoto, S. Kusumoto, Impacts of daylight saving time on software development, in: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019, pp. 502–506, https://doi.org/10.1109/MSR.2019.00076, ISSN: 2574-3864.
[139]
Hu Y., Zhang J., Bai X., Yu S., Yang Z., Influence analysis of Github repositories, SpringerPlus 5 (2016) 1268.
[140]
Hauff C., Gousios G., Matching GitHub developer profiles to job advertisements, in: Proceedings of the 12th Working Conference on Mining Software Repositories, in: MSR ’15, IEEE Press, Florence, Italy, 2015, pp. 362–366.
[141]
A.S. Badashian, E. Stroulia, Measuring user influence in GitHub: the million follower fallacy, in: Proceedings of the 3rd International Workshop on CrowdSourcing in Software Engineering, (CSI-SE ’16), Association for Computing Machinery, Austin, Texas, 2016, pp. 15–21, https://doi.org/10.1145/2897659.2897663.
[142]
Yu Y., Wang H., Yin G., Liu B., Mining and recommending software features across multiple web repositories, in: Proceedings of the 5th Asia-Pacific Symposium on Internetware, in: Internetware ’13, Association for Computing Machinery, Changsha, China, 2013, pp. 1–9.
[143]
Heinze T.S., Stefanko V., Amme W., Mining BPMN processes on GitHub for tool validation and development, in: Nurcan S., Reinhartz-Berger I., Soffer P., Zdravkovic J. (Eds.), Enterprise, Business-Process and Information Systems Modeling, in: Lecture Notes in Business Information Processing, Springer International Publishing, Cham, 2020, pp. 193–208.
[144]
P. Abate, R. Di Cosmo, L. Gesbert, F. Le Fessant, R. Treinen, S. Zacchiroli, Mining component repositories for installability issues, in: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, 2015, pp. 24–33, https://doi.org/10.1109/MSR.2015.10, ISSN: 2160-1860.
[145]
L. Yu, S. Ramaswamy, Mining CVS repositories to understand open-source project developer roles, in: Fourth International Workshop on Mining Software Repositories (MSR’07:ICSE Workshops 2007), 2007, pp. 8, https://doi.org/10.1109/MSR.2007.19, ISSN: 2160-1860.
[146]
Sprint G., Conci J., Mining GitHub classroom commit behavior in elective and introductory computer science courses, J. Comput. Sci. Colleges 35 (2019) 76–84.
[147]
Y. Weicheng, S. Beijun, X. Ben, Mining GitHub: Why commit stops – Exploring the relationship between developer’s commit pattern and file version evolution, in: 2013 20th Asia-Pacific Software Engineering Conference (APSEC), vol. 2, 2013, pp. 165–169, https://doi.org/10.1109/APSEC.2013.133, ISSN: 1530-1362.
[148]
S. Yatish, J. Jiarpakdee, P. Thongtanunam, C. Tantithamthavorn, Mining software defects: Should we consider affected releases? in: 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), 2019, pp. 654–665, https://doi.org/10.1109/ICSE.2019.00075, ISSN: 1558-1225.
[149]
T. Wang, H. Wang, G. Yin, C.X. Ling, X. Li, P. Zou, Mining software profile across multiple repositories for hierarchical categorization, in: 2013 IEEE International Conference on Software Maintenance, 2013, pp. 240–249, https://doi.org/10.1109/ICSM.2013.35, ISSN: 1063-6773.
[150]
X. Meng, B.P. Miller, W.R. Williams, A.R. Bernat, Mining software repositories for accurate authorship, in: 2013 IEEE International Conference on Software Maintenance, 2013, pp. 250–259, https://doi.org/10.1109/ICSM.2013.36, ISSN: 1063-6773.
[151]
Meqdadi O., Alhindawi N., Alsakran J., Saifan A., Migdadi H., Mining software repositories for adaptive change commits using machine learning techniques, Inf. Softw. Technol. 109 (2019) 80–91. URL: http://www.sciencedirect.com/science/article/pii/S0950584919300084.
[152]
Vandecruys O., Martens D., Baesens B., Mues C., De Backer M., Haesen R., Mining software repositories for comprehensible software fault prediction models, J. Syst. Softw. 81 (2008) 823–839. URL: http://www.sciencedirect.com/science/article/pii/S0164121207001902.
[153]
H.K. Dam, B.T.R. Savarimuthu, D. Avery, A. Ghose, Mining software repositories for social norms, in: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, vol. 2, 2015, pp. 627–630, https://doi.org/10.1109/ICSE.2015.209, ISSN: 1558-1225.
[154]
K. Mierle, K. Laven, S. Roweis, G. Wilson, Mining student CVS repositories for performance indicators, in: Proceedings of the 2005 International Workshop on Mining Software Repositories, MSR ’05, Association for Computing Machinery, St. Louis, Missouri, 2005, pp. 1–5, https://doi.org/10.1145/1083142.1083150.
[155]
J. Wang, Y. Dang, H. Zhang, K. Chen, T. Xie, D. Zhang, Mining succinct and high-coverage API usage patterns from source code, in: 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 319–328, https://doi.org/10.1109/MSR.2013.6624045, ISSN: 2160-1860.
[156]
X. Yang, R.G. Kula, N. Yoshida, H. Iida, Mining the modern code review repositories: A dataset of people, process and product, in: 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR), 2016, pp. 460–463.
[157]
Ma Y., Li H., Hu J., Xie R., Chen Y., Mining the network of the programmers: A data-driven analysis of GitHub, in: Proceedings of the 12th Chinese Conference on Computer Supported Cooperative Work and Social Computing, in: ChineseCSCW ’17, Association for Computing Machinery, Chongqing, China, 2017, pp. 165–168.
[158]
Bidoki N.H., Schiappa M., Sukthankar G., Garibay I., Modeling social coding dynamics with sampled historical data, Online Soc. Netw. Media 16 (2020). URL: http://www.sciencedirect.com/science/article/pii/S2468696420300112.
[159]
Sun X., Li B., Leung H., Li B., Li Y., MSR4SM: Using topic models to effectively mining software repositories for software maintenance tasks, Inf. Softw. Technol. 66 (2015) 1–12. URL: http://www.sciencedirect.com/science/article/pii/S0950584915001007.
[160]
G. Destefanis, M. Ortu, D. Bowes, M. Marchesi, R. Tonelli, On measuring affects of github issues’ commenters, in: Proceedings of the 3rd International Workshop on Emotion Awareness in Software Engineering, SEmotion ’18, Association for Computing Machinery, Gothenburg, Sweden, 2018, pp. 14–19, https://doi.org/10.1145/3194932.3194936.
[161]
P. Anbalagan, M. Vouk, On mining data across software repositories, in: 2009 6th IEEE International Working Conference on Mining Software Repositories, 2009, pp. 171–174, https://doi.org/10.1109/MSR.2009.5069498, ISSN: 2160-1860.
[162]
K.V.R. Paixão, C.Z. Felício, F.M. Delfim, M. De A. Maia, On the interplay between non-functional requirements and builds on continuous integration, in: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), 2017, pp. 479–482, https://doi.org/10.1109/MSR.2017.33.
[163]
Vale G., Schmid A., Santos A.R., de Almeida E.S., Apel S., On the relation between Github communication activity and merge conflicts, Empir. Softw. Eng. 25 (2020) 402–433.
[164]
Zhang T., Yang G., Lee B., Chan A.T.S., Predicting severity of bug report by mining bug repository with concept profile, in: Proceedings of the 30th Annual ACM Symposium on Applied Computing, in: SAC ’15, Association for Computing Machinery, Salamanca, Spain, 2015, pp. 1553–1558.
[165]
T.G. Habing, J. Eke, J.S. Kaczmarek, Repository software evaluation using the audit checklist for certification of trusted digital repositories, in: Proceedings of the 6th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL ’06), 2006, pp. 107–108, https://doi.org/10.1145/1141753.1141774.
[166]
Raemaekers S., van Deursen A., Visser J., Semantic versioning and impact of breaking changes in the Maven repository, J. Syst. Softw. 129 (2017) 140–158. URL: http://www.sciencedirect.com/science/article/pii/S0164121216300243.
[167]
Leibzon W., Social network of software development at GitHub, in: Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, in: ASONAM ’16, IEEE Press, Davis, California, 2016, pp. 1374–1376.
[168]
Czibula G., Marian Z., Czibula I.G., Software defect prediction using relational association rule mining, Inform. Sci. 264 (2014) 260–278. URL: http://www.sciencedirect.com/science/article/pii/S0020025513008876.
[169]
Dwivedi A.K., Tirkey A., Rath S.K., Software design pattern mining using classification-based techniques, Front. Comput. Sci. 12 (2018) 908–922.
[170]
Linstead E., Bajracharya S., Ngo T., Rigor P., Lopes C., Baldi P., Sourcerer: mining and searching internet-scale software repositories, Data Min. Knowl. Discov. 18 (2009) 300–336.
[171]
O. Mizuno, S. Ikami, S. Nakaichi, T. Kikuno, Spam filter based approach for finding fault-prone software modules, in: Fourth International Workshop on Mining Software Repositories (MSR’07:ICSE Workshops 2007), 2007, pp. 4, https://doi.org/10.1109/MSR.2007.29, ISSN: 2160-1860.
[172]
M. Ortu, A. Murgia, G. Destefanis, P. Tourani, R. Tonelli, M. Marchesi, B. Adams, The emotional side of software developers in JIRA, in: 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR), 2016, pp. 480–483.
[173]
A. Lamkanfi, J. Pérez, S. Demeyer, The Eclipse and Mozilla defect tracking dataset: A genuine dataset for mining bug information, in: 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 203–206, https://doi.org/10.1109/MSR.2013.6624028, ISSN: 2160-1860.
[174]
S. Raemaekers, A. van Deursen, J. Visser, The Maven repository dataset of metrics, changes, and dependencies, in: 2013 10th Working Conference on Mining Software Repositories (MSR), 2013, pp. 221–224, https://doi.org/10.1109/MSR.2013.6624031, ISSN: 2160-1860.
[175]
Alqahtani S.S., Eghan E.E., Rilling J., Tracing known security vulnerabilities in software repositories – A Semantic Web enabled modeling approach, Sci. Comput. Programm. 121 (2016) 153–175. URL: http://www.sciencedirect.com/science/article/pii/S0167642316000253.
[176]
I. Neamtiu, J.S. Foster, M. Hicks, Understanding source code evolution using abstract syntax tree matching, in: Proceedings of the 2005 International Workshop on Mining Software Repositories, MSR ’05, Association for Computing Machinery, St. Louis, Missouri, 2005, pp. 1–5, https://doi.org/10.1145/1083142.1083143.
[177]
D.M. German, Using software distributions to understand the relationship among free and open source software projects, in: Proceedings of the Fourth International Workshop on Mining Software Repositories, MSR ’07, IEEE Computer Society, USA, 2007, pp. 24, https://doi.org/10.1109/MSR.2007.32.
[178]
P. Weissgerber, M. Pohl, M. Burch, Visual data mining in software archives to detect how developers work together, in: Proceedings of the Fourth International Workshop on Mining Software Repositories, MSR ’07, IEEE Computer Society, USA, 2007, pp. 9, https://doi.org/10.1109/MSR.2007.34.


        Information & Contributors

        Information

        Published In

        Information and Software Technology  Volume 144, Issue C
        Apr 2022
        211 pages

        Publisher

        Butterworth-Heinemann

        United States

        Publication History

        Published: 01 April 2022

        Author Tags

        1. Mining Software Repositories
        2. Systematic literature review
        3. Evidence-based software engineering
        4. Guidelines

        Qualifiers

        • Review-article


        Cited By

        • (2024) Unsupervised and Supervised Co-learning for Comment-based Codebase Refining and its Application in Code Search, Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, pp. 1–12, https://doi.org/10.1145/3674805.3686664. Online publication date: 24-Oct-2024.
        • (2024) On the Anatomy of Real-World R Code for Static Analysis, Proceedings of the 21st International Conference on Mining Software Repositories, pp. 619–630, https://doi.org/10.1145/3643991.3644911. Online publication date: 15-Apr-2024.
        • (2024) Lessons Learned from Mining the Hugging Face Repository, Proceedings of the 1st IEEE/ACM International Workshop on Methodological Issues with Empirical Studies in Software Engineering, pp. 1–6, https://doi.org/10.1145/3643664.3648204. Online publication date: 16-Apr-2024.
        • (2023) Insights into software development approaches: mining Q&A repositories, Empirical Software Engineering 29:1, https://doi.org/10.1007/s10664-023-10417-5. Online publication date: 23-Nov-2023.
        • (2022) Characterizing Commits in Open-Source Software, Proceedings of the XXI Brazilian Symposium on Software Quality, pp. 1–10, https://doi.org/10.1145/3571473.3571508. Online publication date: 7-Nov-2022.
        • (2022) Analysis of Microservice Evolution using Cohesion Metrics, Proceedings of the 16th Brazilian Symposium on Software Components, Architectures, and Reuse, pp. 40–49, https://doi.org/10.1145/3559712.3559716. Online publication date: 3-Oct-2022.
