Integrated Visual Software Analytics on the GitHub Platform
Figure 1. A 2.5D interactive software map visualization of the Microsoft vscode software project. The number of lines of code (LoC) is mapped to weight, the number of functions (NoF) is mapped to height, and the density of comments (DoC) is mapped to color, ranging from inconspicuous (blue) to conspicuous (red).
Figure 2. Process overview showing the participation of different actors through our data-processing pipeline triggered by a new commit. After processing, a visualization component can query the resulting software analytics data and derive visualization artifacts, such as software maps.
Figure 3. Proposed data structure to save commit-based metadata in the git object database. Each commit with software data references the original commit through name matching.
Figure 4. A screenshot of the prototypical client, showing the TensorFlow.js project. The number of lines of code (LoC) is mapped to weight, the number of functions (NoF) is mapped to height, and the density of comments (DoC) is mapped to color, ranging from inconspicuous (blue) to conspicuous (red).
Figure 5. HTML script tag that loads the client and initializes the visualization with the given GitHub project and commit.
Figure 6. Excerpt comparison of TypeScript projects with increasing size and complexity using a software map visualization. The number of lines of code (LoC) is mapped to weight, the number of functions (NoF) is mapped to height, and the density of comments (DoC) is mapped to color, ranging from inconspicuous (blue) to conspicuous (red). The full overview is provided in Figure A1 and Figure A2.
Figure 7. Memory impact of the metric file blob in kB on the repository per commit when measured by number of files (log–log axis). Color represents the number of lines of code as a second visual indicator of correlation. A derived linear regression (gray line) suggests that each file in the repository contributes approximately one kB of base64-encoded metric blob storage per commit.
Figure 8. Extrapolated repository size impact if every commit of the main branch was augmented with software metrics information, measured by base repository size (log–log axis). Color represents the per-commit metric blob size as a second visual indicator. A derived linear regression (gray line) suggests that a repository would increase its size by 1.3-fold, i.e., the final size would be 2.3 times the base size. However, the spread is rather high and corresponds to the number of commits on the main branch of a repository.
Figure 9. Run-time performance impact of the proposed software analysis component, measured by lines of code (log–log axis). Color represents the number of files as a second visual indicator that the analysis correlates with number of files as well. A derived linear regression (gray line) suggests that the analysis component does not scale linearly with the project size.
Figure 10. Run-time performance of the full GitHub Action that includes the proposed software analysis component and metrics blob storage, measured by lines of code (log–log axis). Color represents the number of files as a second visual indicator that the analysis correlates with number of files as well. A derived linear regression (gray line) suggests that the analysis component does not scale linearly with the project size.
Figure A1. Comparison of the first half of TypeScript projects with increasing size and complexity using a software map visualization. The number of lines of code (LoC) is mapped to weight, the number of functions (NoF) is mapped to height, and the density of comments (DoC) is mapped to color, ranging from inconspicuous (blue) to conspicuous (red). Continuation in Figure A2.
Figure A2. Comparison of the second half of TypeScript projects with increasing size and complexity using a software map visualization. The number of lines of code (LoC) is mapped to weight, the number of functions (NoF) is mapped to height, and the density of comments (DoC) is mapped to color, ranging from inconspicuous (blue) to conspicuous (red). Continuation from Figure A1.
Abstract
1. Introduction
- Readily available software analytics tools are often operated as external services;
- Measured software analysis data are kept internally;
- No external use of the data is available.
2. Related Work
2.1. Tools for Mining Software Repositories
2.2. Metric Storage Formats
2.3. Software Visualization
2.4. Software Analytics Systems
3. Approach
3.1. Process Overview
3.2. Analysis
3.3. Storage
3.4. Visualization
3.5. Prototype Implementation Details
- Lines of Code (LoC);
- Number of Comments (NoC);
- Comment Lines of Code (CLoC);
- Density of Comments (DoC);
- Number of Functions (NoF).
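As a rough illustration of how the listed metrics relate to each other, the following Python sketch derives them from a single TypeScript-like source string. It is not the prototype's analysis component. The definitions are assumptions made for this example: LoC counts non-blank lines, NoC counts comment occurrences, CLoC counts the lines those comments span, DoC is taken as CLoC divided by LoC, and NoF simply counts the `function` keyword. The function name `compute_metrics` is hypothetical.

```python
import re

# Matches `// line comments` and `/* block comments */` (assumed comment syntax).
COMMENT_PATTERN = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def compute_metrics(source: str) -> dict:
    """Compute simple per-file metrics under the assumptions stated above."""
    lines = source.splitlines()
    loc = sum(1 for line in lines if line.strip())   # non-blank lines
    comments = COMMENT_PATTERN.findall(source)
    noc = len(comments)                               # number of comments
    cloc = sum(c.count("\n") + 1 for c in comments)   # lines spanned by comments
    doc = cloc / loc if loc else 0.0                  # assumed: DoC = CLoC / LoC
    nof = len(re.findall(r"\bfunction\b", source))   # crude keyword count
    return {"LoC": loc, "NoC": noc, "CLoC": cloc, "DoC": doc, "NoF": nof}


sample = """// adds two numbers
function add(a: number, b: number): number {
  return a + b; /* inline */
}

/* multi
   line */
function noop(): void {}
"""

metrics = compute_metrics(sample)
print(metrics)
```

A regular-expression approach like this miscounts comment markers inside string literals and ignores arrow functions and class methods; a parser-based analysis would be needed for reliable numbers.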
4. Evaluation
4.1. Case Study
4.2. Repository Memory Impact
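The regression results reported in the captions of Figures 7 and 8 can be turned into a back-of-the-envelope estimate. The Python sketch below only restates those two approximate figures (about one kB of base64-encoded metric blob per file per commit, and a 1.3-fold size increase when every main-branch commit is augmented). The function names and the 100 MB example repository are hypothetical, and the captions themselves note a high spread, so the results are indicative at best.

```python
# Approximate values taken from the regression lines described in the
# captions of Figures 7 and 8; both are rough, not exact constants.
KB_PER_FILE_PER_COMMIT = 1.0
RELATIVE_SIZE_INCREASE = 1.3


def estimate_metric_blob_kb(num_files: int) -> float:
    """Estimated size of the metric blob for one commit, in kB."""
    return num_files * KB_PER_FILE_PER_COMMIT


def estimate_augmented_repo_size(base_size_mb: float) -> float:
    """Estimated repository size after augmenting all main-branch commits."""
    return base_size_mb * (1.0 + RELATIVE_SIZE_INCREASE)


# TensorFlow.js has 2532 files according to the table in Appendix A.
print(f"per-commit blob: ~{estimate_metric_blob_kb(2532):.0f} kB")

# Hypothetical repository with a base size of 100 MB.
print(f"augmented size: ~{estimate_augmented_repo_size(100.0):.0f} MB")
```

Multiplying the per-commit estimate by the number of commits would overstate the impact, since file counts change over a project's history and git compresses stored objects; the repository-level factor from Figure 8 is the more appropriate figure for whole-repository estimates.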
4.3. CI Execution Time Impact
4.4. Practical Considerations and Recommendations
5. Discussion
5.1. Threats to Validity
5.1.1. Runtime Analysis
5.1.2. Storage Consumption Analysis
5.2. Limitations
5.2.1. Scalability
5.2.2. Advanced Git Workflows
5.2.3. Security Considerations
5.2.4. Extensibility
5.2.5. Modes of Integration into Development Process
5.2.6. Supported Programming Languages
5.2.7. Supported Metrics
5.2.8. Visualization Approaches
5.2.9. Stored Artifacts
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Project | Location | Branch | # Commits | % TS | # Files | # LoC |
---|---|---|---|---|---|---|
AFFiNE | toeverything/AFFiNE | canary | 5012 | 98.1% | 705 | 58,822 |
Angular | angular/angular | main | 28,924 | 84.5% | 6438 | 762,820 |
Angular CLI | angular/angular-cli | main | 14,499 | 94.6% | 1074 | 138,552 |
Angular Components | angular/components | main | 11,413 | 81.0% | 2074 | 269,875 |
Ant Design | ant-design/ant-design | master | 26,917 | 99.2% | 822 | 53,436 |
Apollo Client | apollo-client | main | 12,105 | 98.4% | 313 | 97,443 |
Babylon.js | BabylonJS/Babylon.js | master | 42,282 | 88.2% | 1829 | 447,296 |
Bun | oven-sh/bun | main | 8399 | 5.4% | 607 | 188,673 |
cheerio | cheeriojs/cheerio | main | 2905 | 74.2% | 35 | 13,074 |
Definitely Typed | DefinitelyTyped/DefinitelyTyped | master | 85,867 | 99.9% | 34,067 | 6,769,450 |
Deno | denoland/deno | main | 10,516 | 22.2% | 1386 | 197,437 |
Electron | electron/electron | main | 27,898 | 31.1% | 195 | 54,764 |
Electron React Boilerplate | electron-react-boilerplate/electron-react-boilerplate | main | 1122 | 81.3% | 6 | 520 |
esbuild | evanw/esbuild | main | 4026 | 4.0% | 19 | 6576 |
eslint-plugin-import | import-js/eslint-plugin-import | main | 2203 | 0.2% | 50 | 347 |
Formly | ngx-formly/ngx-formly | main | 1790 | 98.8% | 608 | 31,366 |
freeCodeCamp.org’s open-source codebase and curriculum | freeCodeCamp/freeCodeCamp | main | 34,553 | 64.1% | 390 | 33,026 |
github-software-analytics-embedding | hpicgs/github-software-analytics-embedding | dev | 164 | 1.6% | 11 | 748 |
GraphQL Code Generator | dotansimha/graphql-code-generator | master | 8130 | 83.4% | 437 | 83,693 |
Hoppscotch | hoppscotch/hoppscotch | main | 5127 | 61.5% | 587 | 75,922 |
Hydrogen | nteract/hydrogen | master | 2372 | 68.7% | 36 | 5685 |
ice.js | alibaba/ice | master | 3067 | 83.4% | 503 | 33,575 |
Ionic | ionic-team/ionic-framework | main | 13,427 | 56.2% | 1034 | 89,790 |
Joplin | laurent22/joplin | dev | 10,687 | 66.5% | 1795 | 190,253 |
mean stack | linnovate/mean | master | 2232 | 51.3% | 33 | 868 |
Mermaid | mermaid-js/mermaid | develop | 9152 | 30.6% | 175 | 23,159 |
Mitosis | BuilderIO/mitosis | main | 1514 | 98.3% | 420 | 45,541 |
Monaco Editor | microsoft/monaco-editor | main | 3327 | 36.4% | 329 | 123,664 |
MUI Core | mui/material-ui | master | 23,644 | 55.9% | 1646 | 95,283 |
Nativefier | nativefier/nativefier | master | 1288 | 87.5% | 62 | 9289 |
NativeScript | NativeScript/NativeScript | main | 7345 | 85.9% | 1200 | 3,226,971 |
NativeScript Angular | NativeScript/nativescript-angular | master | 1867 | 92.0% | 385 | 21,038 |
NativeScript Command-Line Interface | NativeScript/nativescript-cli | main | 6470 | 26.7% | 515 | 110,724 |
NativeScript-Vue | nativescript-vue/nativescript-vue | main | 72 | 79.2% | 32 | 2197 |
NgRx | ngrx/platform | main | 1906 | 87.3% | 1230 | 136,981 |
ngx-admin | akveo/ngx-admin | master | 554 | 67.2% | 242 | 14,329 |
Noodle | noodle-run/noodle | main | 651 | 55.1% | 34 | 1494 |
Nuxt | nuxt/nuxt | main | 5242 | 98.4% | 404 | 30,741 |
Nx | nrwl/nx | master | 11,218 | 96.7% | 2975 | 406,848 |
Prettier | prettier/prettier | main | 9026 | 5.8% | 557 | 10,345 |
Prisma | prisma/prisma | main | 10,256 | 98.2% | 1702 | 147,821 |
Quasar Framework | quasarframework/quasar | dev | 13,575 | 0.3% | 300 | 66,316 |
React | facebook/react | main | 16,135 | 0.5% | 7 | 895 |
RealWorld | gothinkster/realworld | main | 949 | 86.8% | 104 | 6549 |
Rush Stack | microsoft/rushstack | main | 19,801 | 96.0% | 1315 | 167,304 |
RxDB | pubkey/rxdb | master | 10,244 | 96.0% | 558 | 79,486 |
SheetJS | SheetJS/sheetjs | github | 770 | 12.3% | 52 | 12,644 |
Slidev | slidevjs/slidev | main | 1560 | 66.6% | 101 | 9127 |
Socket.IO | socketio/socket.io | main | 2008 | 66.2% | 55 | 10,796 |
Storybook | storybookjs/storybook | next | 56,100 | 69.1% | 1496 | 154,002 |
Strapi Community Edition | strapi/strapi | develop | 33,413 | 73.6% | 1835 | 174,912 |
TensorFlow.js | tensorflow/tfjs | master | 6076 | 80.3% | 2532 | 330,668 |
themer | themerdev/themer | main | 1732 | 98.6% | 74 | 9537 |
TOAST UI Editor | nhn/tui.editor | main | 362 | 85.8% | 315 | 46,744 |
Turbo | vercel/turbo | main | 5842 | 8.2% | 359 | 27,344 |
TypeORM | typeorm/typeorm | master | 5361 | 99.8% | 3150 | 266,117 |
TypeScript RPC | k8w/tsrpc | master | 419 | 99.3% | 74 | 14,860 |
uni-app | dcloudio/uni-app | dev | 10,295 | 0.7% | 102 | 11,078 |
Visual Studio Code | microsoft/vscode | main | 117,393 | 93.7% | 4555 | 1,293,371 |
Vue | vuejs/vue | main | 3591 | 96.7% | 388 | 72,050 |
vuejs/core | vuejs/core | main | 5502 | 96.5% | 457 | 121,640 |
Vuetify | vuetifyjs/vuetify | master | 15,303 | 51.4% | 451 | 40,627 |
webgl-operate | cginternals/webgl-operate | master | 1844 | 70.3% | 181 | 44,000 |
webpack | webpack/webpack | main | 16,408 | 0.2% | 72 | 20,931 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Scheibel, W.; Blum, J.; Lauterbach, F.; Atzberger, D.; Döllner, J. Integrated Visual Software Analytics on the GitHub Platform. Computers 2024, 13, 33. https://doi.org/10.3390/computers13020033