DiCoMo: the digitization cost model

International Journal on Digital Libraries

Abstract

Estimating digitization costs is a very difficult task: accurate values are hard to obtain because of the large number of unknown factors. However, digitization projects need a precise idea of the economic costs and the time involved in developing their contents. The common practice when starting to digitize a new collection is to set a schedule, and to make a firm commitment to fulfil it (in terms of both cost and deadlines), even before the actual digitization work starts. As happens with software development projects, incorrect estimates produce delays and cause cost overruns. Based on methods used in Software Engineering for software development cost prediction, such as COCOMO and Function Points, and using historical data gathered over 5 years at the MCDL project during the digitization of more than 12,000 books, we have developed a method for time-and-cost estimates named DiCoMo (Digitization Cost Model) for digital content production in general. This method can be adapted to different production processes, such as the production of digital XML or HTML texts using scanning and OCR followed by human proofreading and error correction, or the production of digital facsimiles (scanning without OCR). The accuracy of the estimates improves with time, since the algorithms can be optimized by making adjustments based on historical data gathered from previous tasks. Finally, we consider the problem of parallelizing tasks, i.e. dividing the work among a number of encoders who will work in parallel.
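The abstract does not give the DiCoMo equations, so the following is only a generic illustration of the two ideas it mentions: calibrating a COCOMO-style power-law model (effort = a * size^b) from historical data, and estimating elapsed time when work is split among several encoders. Everything in it is an assumption made for illustration: the function names, the use of pages as the size measure, the log-log least-squares fit, the Amdahl-style serial fraction, and the sample figures. None of it is taken from the paper.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by ordinary least squares in log-log space.

    Hypothetical calibration step: sizes could be page counts and efforts
    person-hours recorded for previously completed digitization tasks.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent (slope in log-log space)
    a = math.exp(mean_y - b * mean_x)  # multiplier (from the intercept)
    return a, b


def estimate_effort(a, b, size):
    """Predicted effort for a new task of the given size."""
    return a * size ** b


def parallel_time(effort, workers, serial_fraction):
    """Elapsed time when only part of the work can be divided.

    Amdahl-style assumption: a fixed fraction of the effort is serial
    (e.g. coordination or final checks) and the rest splits evenly.
    """
    return effort * (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # Made-up history: (pages, person-hours) for four finished books.
    history = [(120, 14.0), (250, 31.0), (400, 52.0), (800, 110.0)]
    a, b = fit_power_law([p for p, _ in history], [h for _, h in history])
    effort = estimate_effort(a, b, 600)
    print(f"a={a:.3f}, b={b:.3f}, estimate for 600 pages: {effort:.1f} h")
    print(f"with 3 encoders: {parallel_time(effort, 3, 0.1):.1f} h elapsed")
```

Refitting `a` and `b` as new tasks are completed is one simple way an estimate could improve with accumulated historical data, which is the behaviour the abstract describes.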



Author information

Corresponding author

Correspondence to Alejandro Bia.

Additional information

This study is a substantially revised and extended version of a paper (titled Estimating Digitization Costs in Digital Libraries Using DiCoMo) that originally appeared in the Proceedings of the 14th European Conference on Digital Libraries (ECDL 2010).


About this article

Cite this article

Bia, A., Muñoz, R. & Gómez, J. DiCoMo: the digitization cost model. Int J Digit Libr 11, 141–153 (2010). https://doi.org/10.1007/s00799-011-0073-9


  • DOI: https://doi.org/10.1007/s00799-011-0073-9
