Abstract
These tutorial notes present a methodology for spreadsheet engineering. First, we present data mining and database techniques to reason about spreadsheet data. These techniques are used to compute relationships between spreadsheet elements (cells/columns/rows), which are later used to infer a model defining the business logic of the spreadsheet. Such a model of a spreadsheet data is a visual domain specific language that we embed in a well-known spreadsheet system.
The embedded model is the building block to define techniques for model-driven spreadsheet development, where advanced techniques are used to guarantee the model-instance synchronization. In this model-driven environment, any user data update has to follow the model-instance conformance relation, thus, guiding spreadsheet users to introduce correct data. Data refinement techniques are used to synchronize models and instances after users update/evolve the model.
These notes briefly describe our model-driven spreadsheet environment, the MDSheet environment, that implements the presented methodology. To evaluate both proposed techniques and the MDSheet tool, we have conducted, in laboratory sessions, an empirical study with the summer school participants. The results of this study are presented in these notes.
This work is part funded by ERDF - European Regional Development Fund through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) within projects FCOMP-01-0124-FEDER-010048, and FCOMP-01-0124-FEDER-020532. The first author was funded by FCT grant SFRH/BPD/73358/2010.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Microsoft Excel: http://office.microsoft.com/en-us/excel.
- 2.
Gnumeric: http://projects.gnome.org/gnumeric.
- 3.
OpenOffice: http://www.openoffice.org.
- 4.
LibreOffice: http://www.libreoffice.org.
- 5.
Google Drive: http://drive.google.com.
- 6.
Microsoft Office 365: http://www.microsoft.com/en-us/office365/online-software.aspx.
- 7.
ZoHo Sheet: http://sheet.zoho.com.
- 8.
This list of horror stories is available at: http://www.eusprig.org/horror-stories.htm.
- 9.
HaExcel can be found at http://ssaapp.di.uminho.pt.
- 10.
We omit here the column labels, whose names depend on the number of columns in the generated table.
- 11.
We assume colors are visible in the digital version of this paper.
- 12.
“It allows to assign more precise types to data constructors by restricting the variables of the datatype in the constructors’ result types.”
- 13.
References
Abraham, R., Erwig, M.: Header and unit inference for spreadsheets through spatial analyses. In: 2004 IEEE Symposium on Visual Languages and Human Centric Computing, pp. 165–172, September 2004
Abraham, R., Erwig, M.: UCheck: a spreadsheet type checker for end users. J. Vis. Lang. Comput. 18(1), 71–95 (2007)
Abraham, R., Erwig, M.: Goal-directed debugging of spreadsheets. In: VL/HCC, pp. 37–44. IEEE Computer Society (2005)
Abraham, R., Erwig, M.: Autotest: a tool for automatic test case generation in spreadsheets. In: Proceedings of the 2006 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC 2006), pp. 43–50. IEEE Computer Society (2006)
Abraham, R., Erwig, M.: Inferring templates from spreadsheets. In: Proceedings of the 28th International Conference on Software Engineering, pp. 182–191. ACM, New York (2006)
Abraham, R., Erwig, M.: Type inference for spreadsheets. In: Bossi, A., Maher, M.J. (eds.) Proceedings of the 8th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming, Venice, Italy, 10–12 July 2006, pp. 73–84. ACM (2006)
Abraham, R., Erwig, M.: Goaldebug: a spreadsheet debugger for end users. In: ICSE 2007: Proceedings of the 29th International Conference on Software Engineering, pp. 251–260. IEEE Computer Society, Washington, DC (2007)
Abraham, R., Erwig, M.: Mutation operators for spreadsheets. IEEE Trans. Softw. Eng. 35(1), 94–108 (2009)
Abraham, R., Erwig, M., Kollmansberger, S., Seifert, E.: Visual specifications of correct spreadsheets. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2005, pp. 189–196. IEEE Computer Society (2005)
Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques and Tools. Addison Wesley, Reading (1986)
Alhajj, R.: Extracting the extended entity-relationship model from a legacy relational database. Inf. Syst. 28(6), 597–618 (2003)
Alves, T.L., Silva, P.F., Visser, J.: Constraint-aware schema transformation. Electron. Notes Theor. Comput. Sci. 290, 3–18 (2012)
Bricklin, D.: VisiCalc: Information from its creators, Dan Bricklin and Bob Frankston. http://www.bricklin.com/visicalc.htm. Accessed 5 Dec 2013
Bruins, E.: On Plimpton 322. Pythagorean numbers in Babylonian mathematics. Koninklijke Nederlandse Akademie van Wetenschappen 52, 629–632 (1949)
Burnett, M., Cook, C., Pendse, O., Rothermel, G., Summet, J., Wallace, C.: End-user software engineering with assertions in the spreadsheet paradigm. In: Proceedings of the 25th International Conference on Software Engineering, ICSE 2003, pp. 93–103. IEEE Computer Society (2003)
Campbell-Kelly, M., Croarken, M., Flood, R., Robson, E.: The History of Mathematical Tables: From Sumer to Spreadsheets. Oxford University Press, Oxford (2003)
Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)
Cunha, A., Oliveira, J.N., Visser, J.: Type-safe two-level data transformation. In: Misra, J., Nipkow, T., Sekerinski, E. (eds.) FM 2006. LNCS, vol. 4085, pp. 284–299. Springer, Heidelberg (2006)
Cunha, J., Erwig, M., Saraiva, J.: Automatically inferring classsheet models from spreadsheets. In: IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2010, pp. 93–100. IEEE Computer Society (2010)
Cunha, J., Fernandes, J., Mendes, J., Saraiva, J.: Embedding, evolution, and validation of model-driven spreadsheets. IEEE Trans. Software Eng. PP(99), 1 (2014)
Cunha, J., Fernandes, J.P., Ribeiro, H., Saraiva, J.: Towards a catalog of spreadsheet smells. In: Murgante, B., Gervasi, O., Misra, S., Nedjah, N., Rocha, A.M.A.C., Taniar, D., Apduhan, B.O. (eds.) ICCSA 2012, Part IV. LNCS, vol. 7336, pp. 202–216. Springer, Heidelberg (2012)
Cunha, J., Fernandes, J.P., Mendes, J., Martins, P., Saraiva, J.: Smellsheet detective: a tool for detecting bad smells in spreadsheets. In: Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2012, pp. 243–244. IEEE Computer Society, Washington, DC (2012)
Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: Extension and implementation of ClassSheet models. In: Proceedings of the 2012 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2012, pp. 19–22. IEEE Computer Society (2012)
Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J.: MDSheet: a framework for model-driven spreadsheet engineering. In: Proceedings of the 34th International Conference on Software Engineering, ICSE 2012, pp. 1412–1415. ACM (2012)
Cunha, J., Fernandes, J.P., Saraiva, J.: From relational ClassSheets to UML+OCL. In: Proceedings of the Software Engineering Track at the 27th Annual ACM Symposium on Applied Computing, pp. 1151–1158. ACM (2012)
Cunha, J., Mendes, J., Fernandes, J.P., Saraiva, J.: Embedding and evolution of spreadsheet models in spreadsheet systems. In: Proceedings of the 2011 IEEE Symposium on Visual Languages and Human-Centric Computing, VLHCC 2011, pp. 186–201. IEEE (2011)
Cunha, J., Saraiva, J., Visser, J.: Model-based programming environments for spreadsheets. Sci. Comput. Program. (SCP) 96, 254–275 (2014)
Cunha, J., Visser, J., Alves, T., Saraiva, J.: Type-safe evolution of spreadsheets. In: Giannakopoulou, D., Orejas, F. (eds.) FASE 2011. LNCS, vol. 6603, pp. 186–201. Springer, Heidelberg (2011)
Engels, G., Erwig, M.: ClassSheets: automatic generation of spreadsheet applications from object-oriented specifications. In: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, pp. 124–133. ACM (2005)
Erdweg, S., et al.: The state of the art in language workbenches. In: Erwig, M., Paige, R.F., Van Wyk, E. (eds.) SLE 2013. LNCS, vol. 8225, pp. 197–217. Springer, Heidelberg (2013)
Erwig, M.: Software engineering for spreadsheets. IEEE Softw. 29(5), 25–30 (2009)
Erwig, M., Abraham, R., Cooperstein, I., Kollmansberger, S.: Automatic generation and maintenance of correct spreadsheets. In: Proceedings of the 27th International Conference on Software Engineering, pp. 136–145. ACM (2005)
Erwig, M., Abraham, R., Kollmansberger, S., Cooperstein, I.: Gencel: a program generator for correct spreadsheets. J. Funct. Program. 16(3), 293–325 (2006)
Erwig, M., Burnett, M.: Adding apples and oranges. In: Adsul, B., Ramakrishnan, C.R. (eds.) PADL 2002. LNCS, vol. 2257, pp. 173–191. Springer, Heidelberg (2002)
Fisher II, M., Cao, M., Rothermel, G., Cook, C., Burnett, M.: Automated test case generation for spreadsheets. In: Proceedings of the 24th International Conference on Software Engineering (ICSE 2002), pp. 141–154. ACM Press, New York, 19–25 May 2002
Fisher II, M., Rothermel, G., Brown, D., Cao, M., Cook, C., Burnett, M.: Integrating automated test generation into the WYSIWYT spreadsheet testing methdology. ACM Trans. Softw. Eng. Methodol. 15(2), 150–194 (2006)
Fisher II, M., Rothermel, G., Creelan, T., Burnett, M.: Scaling a dataflow testing methodology to the multiparadigm world of commercial spreadsheets. In: Proceedings of the 17th IEEE International Symposium on Software Reliability Engineering, Raleigh, NC, USA, pp. 13–22, November 2006
Gibbons, J.: Functional programming for domain-specific languages. In: Zsok, V. (ed.) Central European Functional Programming - Summer School on Domain-Specific Languages, July 2013
Hermans, F., Pinzger, M., van Deursen, A.: Automatically extracting class diagrams from spreadsheets. In: D’Hondt, T. (ed.) ECOOP 2010. LNCS, vol. 6183, pp. 52–75. Springer, Heidelberg (2010)
Hermans, F., Pinzger, M., van Deursen, A.: Supporting professional spreadsheet users by generating leveled dataflow diagrams. In: Proceedings of the 33rd International Conference on Software Engineering, ICSE 2011, pp. 451–460. ACM (2011)
Hermans, F., Pinzger, M., van Deursen, A.: Detecting and visualizing inter-worksheet smells in spreadsheets. In: Proceedings of the 2012 International Conference on Software Engineering, ICSE 2012, pp. 441–451. IEEE Press (2012)
Hermans, F., Pinzger, M., van Deursen, A.: Detecting code smells in spreadsheet formulas. In: ICSM, pp. 409–418 (2012)
Hinze, R., Löh, A., Oliveira, B.C.S.: “Scrap your boilerplate” reloaded. In: Hagiya, M. (ed.) FLOPS 2006. LNCS, vol. 3945, pp. 13–29. Springer, Heidelberg (2006)
Hudak, P.: Building domain-specific embedded languages. ACM Comput. Surv. 28(4es), 196 (1996)
Jones, S.P., Blackwell, A., Burnett, M.: A user-centred approach to functions in excel. In: Proceedings of the 8th ACM SIGPLAN International Conference on Functional Programming, ICFP 2003, pp. 165–176. ACM (2003)
Kankuzi, B., Sajaniemi, J.: An empirical study of spreadsheet authors’ mental models in explaining and debugging tasks. In: 2013 IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2013, pp. 15–18 (2013)
Kuiper, M., Saraiva, J.: Lrc - a generator for incremental language-oriented tools. In: Koskimies, K. (ed.) CC 1998. LNCS, vol. 1383, pp. 298–301. Springer, Heidelberg (1998)
Lämmel, R., Visser, J.: A Strafunski application letter. In: Dahl, V. (ed.) PADL 2003. LNCS, vol. 2562, pp. 357–375. Springer, Heidelberg (2002)
Lämmel, R., Saraiva, J., Visser, J. (eds.): GTTSE 2005. LNCS, vol. 4143. Springer, Heidelberg (2006)
Luckey, M., Erwig, M., Engels, G.: Systematic evolution of model-based spreadsheet applications. J. Vis. Lang. Comput. 23(5), 267–286 (2012)
Maier, D.: The Theory of Relational Databases. Computer Science Press, Rockville (1983)
Morgan, C., Gardiner, P.: Data refinement by calculation. Acta Inform. 27, 481–503 (1990)
Nardi, B.A.: A Small Matter of Programming: Perspectives on End User Computing, 1st edn. MIT Press, Cambridge (1993)
Oliveira, J.: A reification calculus for model-oriented software specification. Form. Asp. Comput. 2(1), 1–23 (1990)
Oliveira, J.N.: Transforming data by calculation. In: Lämmel, R., Visser, J., Saraiva, J. (eds.) Generative and Transformational Techniques in Software Engineering II. LNCS, vol. 5235, pp. 134–195. Springer, Heidelberg (2008)
Panko, R.R.: What we know about spreadsheet errors. J. End User Comput. (Special issue on Scaling Up End User Development) 10(2), 15–21 (1998)
Panko, R.R.: Spreadsheet errors: what we know. what we think we can do. In: Proceedings of the European Spreadsheet Risks Interest Group (EuSpRIG) (2000)
Panko, R.R.: Facing the problem of spreadsheet errors. Decis. Line 37(5), 8–10 (2006)
Panko, R.R., Aurigemma, S.: Revising the panko-halverson taxonomy of spreadsheet errors. Decis. Support Syst. 49(2), 235–244 (2010)
Panko, R.R., Ordway, N.: Sarbanes-Oxley: What About all the Spreadsheets? CoRR abs/0804.0797 (2008)
Peyton Jones, S., Washburn, G., Weirich, S.: Wobbly types: type inference for generalised algebraic data types. Technical report, MS-CIS-05-26, University of Pennsylvania, July 2004
Powell, S.G., Baker, K.R., Lawson, B.: A critical review of the literature on spreadsheet errors. Decis. Support Syst. 46(1), 128–138 (2008)
Rajalingham, K., Chadwick, D.R., Knight, B.: Classification of spreadsheet errors. In: Proceedings of the 2001 European Spreadsheet Risks Interest Group, EuSpRIG 2001, Amsterdam (2001)
Reinhart, C.M., Rogoff, K.S.: Growth in a time of debt. Am. Econ. Rev. 100(2), 573–578 (2010)
Robson, E.: Neither Sherlock Holmes nor Babylon: a reassessment of Plimpton 322. Historia Mathematica 28(3), 167–206 (2001)
Rothermel, G., Burnett, M., Li, L., Sheretov, A.: A methodology for testing spreadsheets. ACM Trans. Softw. Eng. Methodol. 10, 110–147 (2001)
Ruthruff, J., Creswick, E., Burnett, M., Cook, C., Prabhakararao, S., Fisher II, M., Main, M.: End-user software visualizations for fault localization. In: Proceedings of the ACM Symposium on Software Visualization, San Diego, CA, USA, pp. 123–132, June 2003
Scaffidi, C., Shaw, M., Myers, B.: Estimating the numbers of end users and end user programmers. In: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 207–214 (2005)
Stevens, P., Whittle, J., Booch, G. (eds.): UML 2003. LNCS, vol. 2863. Springer, Heidelberg (2003)
Swierstra, D., Azero, P., Saraiva, J.: Designing and implementing combinator languages. In: Swierstra, S.D., Oliveira, J.N. (eds.) AFP 1998. LNCS, vol. 1608, pp. 150–206. Springer, Heidelberg (1999)
Ullman, J.D., Widom, J.: A First Course in Database Systems. Prentice Hall, Upper Saddle River (1997)
Ullman, J.: Principles of Database and Knowledge-Base Systems, vol. I. Computer Science Press, Rockville (1988)
Visser, E.: A survey of strategies in rule-based program transformation systems. J. Symbolic Comput. 40, 831–873 (2005)
Visser, J., Saraiva, J.: Tutorial on strategic programming across programming paradigms. In: 8th Brazilian Symposium on Programming Languages, Niteroi, Brazil, May 2004
Acknowledgments
The theories, techniques and tools presented in this tutorial paper were developed under the project SSaaPP - SpreadSheets as a Programming Paradigm: a research project funded by the Portuguese Science Foundation (contract number FCOMP-01-0124-FEDER-010048). We would like to thank the members and consultants of this project who made important contributions for the results presented in this document, namely: Rui Maranhão Abreu, Tiago Alves, Laura Beckwith, Orlando Belo, Martin Erwig, Pedro Martins, Hugo Pacheco, Christophe Peixoto, Rui Pereira, Alexandre Perez, Hugo Ribeiro, André Riboira, André Silva, and Joost Visser.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Cunha, J., Fernandes, J.P., Mendes, J., Saraiva, J. (2015). Spreadsheet Engineering. In: Zsók, V., Horváth, Z., Csató, L. (eds) Central European Functional Programming School. CEFP 2013. Lecture Notes in Computer Science(), vol 8606. Springer, Cham. https://doi.org/10.1007/978-3-319-15940-9_6
Download citation
DOI: https://doi.org/10.1007/978-3-319-15940-9_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15939-3
Online ISBN: 978-3-319-15940-9
eBook Packages: Computer ScienceComputer Science (R0)