Abstract
One of the major challenges of the post-genomic era is exploiting the vast amounts of biological data stored in the numerous heterogeneous biological databases distributed worldwide. Most research projects in bioinformatics start with data retrieval from selected sources. However, identifying appropriate data sources is not trivial and requires a representation of knowledge about those sources. We present the BioRegistry project, which aims to provide means to represent and exploit knowledge associated with biological databases. As a first step, a repository structure has been designed to organise the metadata associated with databases into five categories: database identification, topics covered, quality information, access/availability, and tracking of the metadata. The BioRegistry model and its relationship to the DCMI (Dublin Core Metadata Initiative) are described. Prototypes with various functionalities to feed, maintain and exploit the repository are presented.
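The five metadata categories can be pictured as one record per database. The following is only a minimal sketch under stated assumptions, not the BioRegistry schema: the class name `DatabaseRecord`, the helper functions, every dictionary key, and the example entries are hypothetical. The mapping onto Dublin Core element names (`title`, `identifier`, `subject`, `rights`, `date`) is an illustrative guess at how such a record might line up with the DCMI element set, not the mapping described in the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatabaseRecord:
    """One repository entry, grouped into the five categories of the abstract.

    All keys used inside the dictionaries are hypothetical.
    """
    identification: Dict[str, str]                          # e.g. name, url
    topics: List[str]                                       # topics covered
    quality: Dict[str, str] = field(default_factory=dict)   # quality information
    access: Dict[str, str] = field(default_factory=dict)    # access/availability
    tracking: Dict[str, str] = field(default_factory=dict)  # tracking of the metadata


def find_by_topic(records: List[DatabaseRecord], topic: str) -> List[DatabaseRecord]:
    """Return the records whose topic list contains `topic` (case-insensitive)."""
    wanted = topic.lower()
    return [r for r in records if any(t.lower() == wanted for t in r.topics)]


def to_dublin_core(record: DatabaseRecord) -> Dict[str, object]:
    """Project a record onto a few Dublin Core elements (assumed mapping)."""
    return {
        "title": record.identification.get("name", ""),
        "identifier": record.identification.get("url", ""),
        "subject": list(record.topics),
        "rights": record.access.get("licence", ""),
        "date": record.tracking.get("last_updated", ""),
    }


# Placeholder entries; they do not describe real databases.
registry = [
    DatabaseRecord(
        identification={"name": "ExampleProteinDB", "url": "https://example.org/protein"},
        topics=["proteins", "sequences"],
        access={"licence": "free for academic use"},
        tracking={"last_updated": "2005-01-01"},
    ),
    DatabaseRecord(
        identification={"name": "ExamplePathwayDB", "url": "https://example.org/pathway"},
        topics=["pathways"],
    ),
]

hits = find_by_topic(registry, "Proteins")
print([r.identification["name"] for r in hits])
print(to_dublin_core(hits[0]))
```

Keeping the categories as separate fields makes it possible to select sources on one criterion (here, topic) and then inspect the others, such as access conditions, before retrieving any data.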
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Smaïl-Tabbone, M., Osman, S., Messai, N., Napoli, A., Devignes, M.D. (2005). BioRegistry: A Structured Metadata Repository for Bioinformatic Databases. In: Berthold, M.R., Glen, R.C., Diederichs, K., Kohlbacher, O., Fischer, I. (eds) Computational Life Sciences. CompLife 2005. Lecture Notes in Computer Science, vol 3695. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11560500_5
DOI: https://doi.org/10.1007/11560500_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29104-6
Online ISBN: 978-3-540-31726-5
eBook Packages: Computer Science, Computer Science (R0)