Abstract
Flexible digital library systems need to be able to accept, or “import,” documents and metadata in a variety of forms, and associate metadata with the appropriate documents. This paper analyzes the requirements of the import process for general digital libraries. The requirements include (a) format conversion for source documents, (b) the ability to incorporate existing conversion utilities, (c) provision for metadata to be specified in the document files themselves and/or in separate metadata files, (d) format conversion for metadata files, (e) provision for metadata to be computed from the document content, and (f) flexible ways of associating metadata with documents or sets of documents. We argue that these requirements are so open-ended that they are best met by an extensible architecture that facilitates the addition of new document formats and metadata facilities to existing digital library systems. An implementation of this architecture is briefly described.
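The abstract does not spell out the architecture, so the following is only a minimal sketch of what a plugin-style import pipeline covering requirements (a), (c), (d), (e), and (f) might look like. Every name in it (`Plugin`, `TextPlugin`, `MetadataFilePlugin`, `Collection`, `import_files`) and the tab-separated `.meta` file layout are assumptions made for illustration, not the interface of the implementation the paper describes.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Document:
    name: str
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Collection:
    documents: list = field(default_factory=list)
    rules: list = field(default_factory=list)  # (pattern, key, value) triples


class Plugin:
    """Hypothetical base class: a plugin either claims a file or declines it."""

    def process(self, name, content, collection):
        raise NotImplementedError


class TextPlugin(Plugin):
    """Requirement (a): convert one source format (plain text) to a Document."""

    def process(self, name, content, collection):
        if not name.endswith(".txt"):
            return False
        doc = Document(name, content)
        # Requirement (e): metadata computed from the document content.
        doc.metadata["word_count"] = len(content.split())
        collection.documents.append(doc)
        return True


class MetadataFilePlugin(Plugin):
    """Requirements (c), (d), (f): read a separate metadata file.

    Assumed (invented) layout: one 'pattern<TAB>key<TAB>value' rule per line,
    where the pattern selects a document or a set of documents by name.
    """

    def process(self, name, content, collection):
        if not name.endswith(".meta"):
            return False
        for line in content.splitlines():
            pattern, key, value = line.split("\t")
            collection.rules.append((pattern, key, value))
        return True


def import_files(files, plugins):
    """Offer each file to the plugins in order; the first to claim it wins."""
    collection = Collection()
    for name, content in files.items():
        if not any(p.process(name, content, collection) for p in plugins):
            print(f"no plugin accepted {name}")
    # Apply the rules only after every file is read, so file order is irrelevant.
    for doc in collection.documents:
        for pattern, key, value in collection.rules:
            if fnmatchcase(doc.name, pattern):
                doc.metadata[key] = value
    return collection
```

In this sketch, supporting a new document or metadata format means writing one more `Plugin` subclass and adding it to the list passed to `import_files`. Requirement (b) could be met in the same way, by a plugin that calls an existing conversion utility before building the `Document`.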
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Witten, I.H., Bainbridge, D., Paynter, G., Boddie, S. (2002). Importing Documents and Metadata into Digital Libraries: Requirements Analysis and an Extensible Architecture. In: Agosti, M., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2002. Lecture Notes in Computer Science, vol 2458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45747-X_29
DOI: https://doi.org/10.1007/3-540-45747-X_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44178-6
Online ISBN: 978-3-540-45747-3
eBook Packages: Springer Book Archive