How changes affect software entropy: an empirical study

Published: 01 February 2014 Publication History

Abstract

Context: Software systems change continuously for various reasons, such as adding new features, fixing bugs, or refactoring. Changes may either increase source code complexity and disorganization or help reduce it.

Aim: This paper empirically investigates the relationship of source code complexity and disorganization, measured using source code change entropy, with four factors: the presence of refactoring activities, the number of developers working on a source code file, the participation of classes in design patterns, and the different kinds of changes occurring in the system, classified by topics extracted from commit notes.

Method: We carried out an exploratory study on an interval of the lifetime of four open source systems (ArgoUML, Eclipse-JDT, Mozilla, and Samba), analyzing the relationship between source code change entropy and the four factors above: refactoring activities, number of contributors to a file, participation of classes in design patterns, and change topics.

Results: The study shows that (i) change entropy decreases after refactoring, (ii) files changed by a higher number of developers tend to exhibit a higher change entropy than others, (iii) classes participating in certain design patterns exhibit a higher change entropy than others, and (iv) changes related to different topics exhibit different change entropy; for example, bug fixes exhibit a limited change entropy, while changes introducing new features exhibit a high change entropy.

Conclusions: The results indicate that the nature of changes (in particular, changes related to refactoring), the software design, and the number of active developers are factors related to change entropy. Our findings contribute to understanding the software aging phenomenon and are a preliminary step toward identifying better ways to counteract it.
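The abstract does not spell out how change entropy is computed. As a rough illustration only, the sketch below assumes the common formulation in which change entropy is the Shannon entropy of the distribution of changes across files within a time window, optionally normalized by the maximum entropy for the number of files touched. The function name `change_entropy` and the input format (one file path per recorded change) are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def change_entropy(changed_files, normalize=True):
    """Shannon entropy of how changes are spread across files.

    changed_files: iterable with one file path per recorded change
    (hypothetical input format, assumed for illustration).
    Returns a value in [0, 1] when normalize is True.
    """
    counts = Counter(changed_files)
    total = sum(counts.values())
    # No changes, or all changes in a single file: no scattering at all.
    if total == 0 or len(counts) == 1:
        return 0.0
    # Probability that a given change touches each file.
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    if normalize:
        # Divide by the maximum entropy for this number of files.
        entropy /= math.log2(len(counts))
    return entropy


# Changes concentrated in one file versus spread evenly over four files.
focused = ["Parser.java"] * 4
scattered = ["Parser.java", "Lexer.java", "Ast.java", "Main.java"]
print(change_entropy(focused), change_entropy(scattered))
```

Under this assumed definition, a period whose changes are concentrated in a few files scores low, while a period whose changes are scattered evenly across many files scores close to 1.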

Published In

Empirical Software Engineering, Volume 19, Issue 1
February 2014
266 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Mining software repositories
  2. Software complexity
  3. Software entropy
