Abstract
Static type systems play an essential role in contemporary programming languages. Despite their importance, whether static type systems affect human software development capabilities remains an open question. One frequently mentioned argument in favor of static type systems is that they improve the maintainability of software systems, an often-used claim for which there is little empirical evidence. This paper describes an experiment that tests whether static type systems improve the maintainability of software systems, in terms of understanding undocumented code, fixing type errors, and fixing semantic errors. The results provide rigorous empirical evidence that static types are indeed beneficial to these activities, except when fixing semantic errors. We further conduct an exploratory analysis of the data in order to understand possible reasons for the effect of type systems on the three kinds of tasks used in this experiment. From the exploratory analysis, we conclude that developers using a dynamic type system tend to look at different files more frequently when doing programming tasks, which is a potential reason for the observed differences in time.
Notes
The interested reader can find more arguments for both cases online, including the lively discussion available at: http://programmers.stackexchange.com/questions/122205/
For more information about Groovy and the differences it has with Java, the interested reader can consult the following web page: http://groovy.codehaus.org/Differences+from+Java
We plan to gather data on this aspect in future experiments.
Mauchly’s sphericity test was performed using the standard packages implemented in SPSS. The variables used were the within-subject variable programming task and the between-subject variable programming language.
Anticipating the next section, we also see this in the numbers of test runs in Fig. 7, where test runs for Groovy are significantly worse for the supposedly simpler CIT1 than they are for CIT2.
Note that the programming tasks in Mayer et al. (2012) were exclusively CIT tasks. Hence, a comparison is based only on the CIT tasks in the experiment described here.
References
Bruce KB (2002) Foundations of object-oriented languages: types and semantics. MIT Press, Cambridge
Bird R, Wadler P (1988) An introduction to functional programming. Prentice Hall International (UK) Ltd., Hertfordshire
Cardelli L (1997) Type systems. In: Tucker AB (ed) The computer science and engineering handbook, chap 103. CRC Press, Boca Raton, pp 2208–2236
Callaú O, Robbes R, Tanter É, Röthlisberger D (2013) How (and why) developers use the dynamic features of programming languages: the case of Smalltalk. Empir Softw Eng 18(6):1156–1194
Curtis B (1988) Five paradigms in the psychology of programming. In: Helander M (ed) Handbook of human-computer interaction. Elsevier, North-Holland, pp 87–106
Daly MT, Sazawal V, Foster JS (2009) Work in progress: an empirical study of static typing in Ruby. Workshop on evaluation and usability of programming languages and tools (PLATEAU). Orlando, Florida, October 2009
Denny P, Luxton-Reilly A, Tempero E (2012) All syntax errors are not equal. In: Proceedings of the 17th ACM annual conference on innovation and technology in computer science education, ITiCSE ’12. ACM, New York, pp 75–80
Endrikat S, Hanenberg S (2011) Is aspect-oriented programming a rewarding investment into future code changes? A socio-technical study on development and maintenance time. In: The 19th IEEE international conference on program comprehension, ICPC 2011, Kingston, ON, Canada, June 22–24, 2011, pp 51–60
Feigenspan J, Kästner C, Liebig J, Apel S, Hanenberg S (2012) Measuring programming experience. In: IEEE 20th international conference on program comprehension, ICPC 2012, Passau, Germany, June 11–13, 2012. ICPC’12, pp 73–82
Gannon JD (1977) An experimental evaluation of data type conventions. Commun ACM 20(8):584–595
Gat E (2000) Point of view: LISP as an alternative to Java. Intelligence 11(4):21–24
Gravetter FJ, Wallnau LB (2009) Statistics for the behavioral sciences. Wadsworth Cengage Learning
Hanenberg S (2010) An experiment about static and dynamic type systems: doubts about the positive impact of static type systems on development time. In: Proceedings of the ACM international conference on object oriented programming systems languages and applications, OOPSLA ’10. ACM, New York, pp 22–35
Hanenberg S (2011) A chronological experience report from an initial experiment series on static type systems. In: 2nd workshop on empirical evaluation of software composition techniques (ESCOT). Lancaster
Hudak P, Jones MP (1994) Haskell vs. Ada vs. C++ vs. Awk vs. ... an experiment in software prototyping productivity. Technical report
Hanenberg S, Kleinschmager S, Josupeit-Walter M (2009) Does aspect-oriented programming increase the development speed for crosscutting code? An empirical study. In: Proceedings of the 2009 3rd international symposium on empirical software engineering and measurement, ESEM ’09, Lake Buena Vista, Florida. IEEE Computer Society, pp 156–167
Höst M, Regnell B, Wohlin C (2000) Using students as subjects—a comparative study of students and professionals in lead-time impact assessment. Empir Softw Eng 5(3):201–214
Juristo N, Moreno AM (2001) Basics of software engineering experimentation. Springer
Juzgado NJ, Vegas S (2011) The role of non-exact replications in software engineering experiments. Empir Softw Eng 16(3):295–324
Kitchenham B, Al-Khilidar H, Ali Babar M, Berry M, Cox K, Keung J, Kurniawati F, Staples M, Zhang H, Zhu L (2006) Evaluating guidelines for empirical software engineering studies. In: ISESE ’06: proceedings of the 2006 ACM/IEEE international symposium on Empirical software engineering. ACM, New York, pp 38–47
Kleinschmager S, Hanenberg S, Robbes R, Tanter É, Stefik A (2012) Do static type systems improve the maintainability of software systems? An empirical study. In: IEEE 20th international conference on program comprehension, ICPC 2012, Passau, Germany, June 11–13, 2012, pp 153–162
Kleinschmager S (2011) An empirical study using Java and Groovy about the impact of static type systems on developer performance when using and adapting software systems. Master thesis at the institute for computer science and business information systems, University of Duisburg-Essen
Ko AJ, Myers BA, Coblenz MJ, Aung HH (2006) An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks. IEEE Trans Softw Eng 32(12):971–987
McConnell S (2010) What does 10x mean? Measuring variations in programmer productivity. In: Oram A, Wilson G (eds) Making software: what really works, and why we believe it, O’Reilly series. O’Reilly Media, pp 567–575
Mayer C, Hanenberg S, Robbes R, Tanter É, Stefik A (2012) An empirical study of the influence of static type systems on the usability of undocumented software. In: ACM SIGPLAN conference on object-oriented programming systems and applications, OOPSLA ’12
Nierstrasz O, Bergel A, Denker M, Ducasse S, Gälli M, Wuyts R (2005) On the revival of dynamic languages. In: Proceedings of the 4th international conference on software composition, SC’05. Springer-Verlag, Berlin, Heidelberg, pp 1–13
Pfleeger SL (1995) Experimental design and analysis in software engineering. Ann Softw Eng 1:219–253
Pierce BC (2002) Types and programming languages. MIT Press, Cambridge
Prechelt L (2000) An empirical comparison of seven programming languages. Computer 33:23–29
Prechelt L (2001) Kontrollierte Experimente in der Softwaretechnik. Springer, Berlin
Prechelt L, Tichy WF (1998) A controlled experiment to assess the benefits of procedure argument type checking. IEEE Trans Softw Eng 24(4):302–312
Richards G, Hammer C, Burg B, Vitek J (2011) The eval that men do - a large-scale study of the use of eval in JavaScript applications. In: ECOOP 2011 - object-oriented programming - 25th European conference, Lancaster, UK, July 25–29, 2011 Proceedings, pp 52–78
Rosenthal R, Rosnow R (2008) Essentials of behavioral research: methods and data analysis. McGraw-Hill higher education. McGraw-Hill Companies, Incorporated
Steinberg M, Hanenberg S (2012) What is the impact of static type systems on debugging type errors and semantic errors? An empirical study of differences in debugging time using statically and dynamically typed languages - unpublished work in progress
Stuchlik A, Hanenberg S (2011) Static vs. dynamic type systems: an empirical study about the relationship between type casts and development time. In: Proceedings of the 7th symposium on dynamic languages, DLS 2011, October 24, 2011, Portland, OR, USA. ACM, pp 97–106
Sjøberg DIK, Hannay JE, Hansen O, Kampenes VB, Karahasanović A, Liborg N-L, Rekdal AC (2005) A survey of controlled experiments in software engineering. IEEE Trans Softw Eng 31(9):733–753
Tichy WF (2000) Hints for reviewing empirical work in software engineering. Empir Softw Eng 5(4):309–312
Tratt L (2009) Dynamically typed languages. Adv Comput 77:149–184
van Deursen A, Moonen L (2006) Documenting software systems using types. Sci Comput Program 60(2):205–220
Wohlin C, Runeson P, Höst M, Ohlsson MC, Regnell B, Wesslén A (2000) Experimentation in software engineering: an introduction. Kluwer, Norwell
Additional information
Communicated by: Michael Godfrey and Arie van Deursen
Romain Robbes is partially funded by FONDECYT project 11110463, Chile, and by Program U-Apoya, University of Chile. Éric Tanter is partially funded by FONDECYT project 1110051, Chile. Andreas Stefik is partially funded by the National Science Foundation under grant no. CNS-0940521. We thank them for their generous support of this work.
Appendix A: Raw Measurement Data
About this article
Cite this article
Hanenberg, S., Kleinschmager, S., Robbes, R. et al. An empirical study on the impact of static typing on software maintainability. Empir Software Eng 19, 1335–1382 (2014). https://doi.org/10.1007/s10664-013-9289-1
DOI: https://doi.org/10.1007/s10664-013-9289-1