PRET19: Automatic Recognition and Indexing of Handwritten Loan Registers from 19th Century Parisian Universities

  • Conference paper
Linking Theory and Practice of Digital Libraries (TPDL 2024)

Abstract

The PRET19 project aims to carry out a comprehensive analysis of the nineteenth-century loan registers of several Parisian university libraries. These historical documents provide invaluable insights into the circulation of books and the intellectual engagement of the academic community during a transformative period. By reconstructing the relationships and trends between borrowers, the project offers a unique perspective on the intellectual landscape of Parisian universities. In this first phase, the registers come from three different university libraries and exhibit a great diversity in layout, handwriting and content, which poses significant challenges for data processing. To address these challenges, we developed a document processing workflow that combines automatic handwritten text recognition (HTR) with manual processing and validation. In addition, we provide a detailed description of the database, which is designed to model the information extracted from the registers comprehensively, so that the data is structured to answer anticipated research queries. Once processing is complete, the data will be accessible from winter 2024 via a dedicated website, providing a comprehensive digital resource for historical research.

Notes

  1. https://www.bis-sorbonne.fr/biu/spip.php?rubrique2

  2. https://www.collexpersee.eu/projet/es-lettres/

  3. https://www.collexpersee.eu/projet/pret19/

  4. http://comitehistoire.bnf.fr/recherches_en_course

  5. By “view”, we mean a non-blank page.

  6. See the lecture on borrowing and borrowers of mathematics by N. Verber and V. Rebolledo-Dhuin at the SFHST congress, 19 April 2023: https://hal.science/hal-04452449

  7. https://www.ifla.org/

  8. https://heuristnetwork.org/

  9. https://docs.ultralytics.com/tasks/

  10. https://www.elastic.co/elasticsearch

  11. At the following address: https://pret19.bis-sorbonne.fr/; see also Sect. 6.2.

  12. https://heurist.huma-num.fr/ScriptaManent/web/13822/36204


Acknowledgments

We thank the GIS CollEx-Persée for their financial support. The GIS CollEx-Persée, under the auspices of the French Ministry of Higher Education, Research, and Innovation (MESRI), has been instrumental in enhancing the accessibility of heritage documents for scientific research through a national network of library cooperation established in 2017.

Author information

Corresponding author

Correspondence to Viera Rebolledo-Dhuin .

6 Appendices

6.1 Detailed data model

This appendix presents the data model used in the project and details the relational structure among the entities associated with historical library loan registers. The key entities, namely Prêt (Loan), Registre (Register), Vue (View), Exemplaire(s) (Copy(ies)), Œuvre (Work), Personne (Individual), Collectivité (Community), Localisation (Location) and Qualité (Profession), are interconnected to represent both source-derived information and enriched metadata.

[Figure a: diagram of the data model]
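As an illustration only, the entities listed above could be sketched as plain Python dataclasses. This is a minimal sketch and not the project's actual schema, which is implemented in Heurist: every field name (`identifier`, `library`, `number`, `shelfmark`, `qualites`, and so on) and the helper `loans_by_borrower` are assumptions introduced for the example, Collectivité and Localisation are omitted for brevity, and the sample values are placeholders rather than project data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Registre:
    """A loan register (field names are hypothetical)."""
    identifier: str
    library: str


@dataclass
class Vue:
    """A view, i.e. a non-blank page of a register."""
    registre: Registre
    number: int


@dataclass
class Oeuvre:
    """A work, independent of any physical copy."""
    title: str


@dataclass
class Exemplaire:
    """A physical copy of a work."""
    oeuvre: Oeuvre
    shelfmark: Optional[str] = None


@dataclass
class Personne:
    """An individual, with zero or more Qualité (profession) labels."""
    name: str
    qualites: List[str] = field(default_factory=list)


@dataclass
class Pret:
    """A loan: a borrower, the view it was recorded on, and the copies lent."""
    borrower: Personne
    vue: Vue
    exemplaires: List[Exemplaire] = field(default_factory=list)


def loans_by_borrower(prets: List[Pret]) -> Dict[str, List[Pret]]:
    """Group loans by borrower name (hypothetical helper)."""
    grouped: Dict[str, List[Pret]] = {}
    for pret in prets:
        grouped.setdefault(pret.borrower.name, []).append(pret)
    return grouped


# Placeholder values only; none of this is project data.
register = Registre(identifier="REG-1", library="Library X")
view = Vue(registre=register, number=1)
copy_ = Exemplaire(oeuvre=Oeuvre(title="Work T"))
borrower = Personne(name="Borrower A", qualites=["student"])
loans = [Pret(borrower=borrower, vue=view, exemplaires=[copy_])]
grouped = loans_by_borrower(loans)
```

Separating Œuvre from Exemplaire in the sketch mirrors the distinction the model draws between a work and its physical copies, so that several loans of different copies can still be traced back to the same work.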

6.2 Structure of the website

The website (currently under construction, in private mode, and due to go public in winter 2024) includes static pages (documentation) and database query pages.

[Figure b: structure of the website]

The first two tabs (Accueil, Projet) present information on the project, the corpus concerned and the context in which the data was created. The third, fourth, fifth and sixth tabs (Personnes, Collectivités, Transactions, Sources) are used to query the database. Visitors can search via facets and export the results, for example in CSV or JSON format. The website’s ergonomics, which owe much to the Scripta Manent project (see footnote 12), also built with Heurist, feature pop-ups that let users navigate from a search to one or more related entities without losing legibility. In the PRET19 project, pop-ups are thus used to display the sources (original register pages) in Mirador alongside the records in the database. Finally, the last tab (Documentation), a static page, provides documentation on lending in each of the libraries throughout the 19th century: regulations, background information and statistics on the type of borrower according to origin (Faculty of Science or Literature), status (professor or student), etc.
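The faceted search and CSV/JSON export described above can be illustrated with a short, hedged sketch. It does not reproduce the website's implementation or any Heurist API: the functions `facet_filter`, `to_json` and `to_csv`, the field names `borrower`, `origin` and `status`, and the sample records are all assumptions made up for the example.

```python
import csv
import io
import json


def facet_filter(records, **facets):
    """Keep the records whose fields equal every requested facet value."""
    return [
        record for record in records
        if all(record.get(key) == value for key, value in facets.items())
    ]


def to_json(records):
    """Serialise records as JSON, keeping accented characters readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_csv(records, fieldnames):
    """Serialise records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder records only; none of this is project data.
SAMPLE = [
    {"borrower": "Borrower A", "origin": "Science", "status": "professor"},
    {"borrower": "Borrower B", "origin": "Literature", "status": "student"},
    {"borrower": "Borrower C", "origin": "Science", "status": "student"},
]

science_students = facet_filter(SAMPLE, origin="Science", status="student")
csv_text = to_csv(science_students, ["borrower", "origin", "status"])
json_text = to_json(science_students)
```

Combining facets with a logical AND, as done here, is one common convention; whether the website combines facets this way is not stated in the text.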

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Périssier, L., Rebolledo-Dhuin, V., Petiot, MT., Schneider, Y., Kermorvant, C. (2024). PRET19: Automatic Recognition and Indexing of Handwritten Loan Registers from 19th Century Parisian Universities. In: Antonacopoulos, A., et al. Linking Theory and Practice of Digital Libraries. TPDL 2024. Lecture Notes in Computer Science, vol 15177. Springer, Cham. https://doi.org/10.1007/978-3-031-72437-4_21

  • DOI: https://doi.org/10.1007/978-3-031-72437-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72436-7

  • Online ISBN: 978-3-031-72437-4

  • eBook Packages: Computer Science, Computer Science (R0)
