BY-NC-ND 4.0 license Open Access Published by De Gruyter October 18, 2016

Automatic extraction of microorganisms and their habitats from free text using text mining workflows

  • BalaKrishna Kolluru, Sirintra Nakjang, Robert P. Hirt, Anil Wipat and Sophia Ananiadou

Summary

In this paper we illustrate the use of text mining workflows to automatically extract instances of microorganisms and their habitats from free text; these entries can then be curated and added to different databases. To this end, we use a Conditional Random Field (CRF) based classifier, as part of the workflows, to extract mentions of microorganisms and habitats, and the relations between organisms and their habitats.

Results indicate good performance for the extraction of microorganisms and for the relation extraction aspect of the task (precision of over 80%), while habitat recognition is only moderate (precision of about 65%). We also conjecture that PDF-to-text conversion can be quite noisy, and that this noise implicitly affects any sentence-based relation extraction algorithm.
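The effect of conversion noise on sentence-based relation extraction can be illustrated with a minimal sketch. It is not the workflow or the CRF classifier used in the paper: entity recognition is replaced by lookup in two small hypothetical dictionaries, sentences are split with a naive regular expression, and relations are reduced to same-sentence co-occurrence. The example texts, the simulated page-header residue and all function names are assumptions made for illustration only.

```python
import re
from itertools import product

# Hypothetical toy dictionaries standing in for a trained entity tagger.
ORGANISMS = {"Escherichia coli", "Vibrio cholerae"}
HABITATS = {"human gut", "seawater"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_pairs(text):
    """Return (organism, habitat) pairs that co-occur within one sentence."""
    pairs = []
    for sentence in split_sentences(text):
        orgs = sorted(o for o in ORGANISMS if o in sentence)
        habs = sorted(h for h in HABITATS if h in sentence)
        pairs.extend(product(orgs, habs))
    return pairs


# Clean text: each organism shares a sentence with its habitat.
clean = (
    "Escherichia coli is commonly found in the human gut. "
    "Vibrio cholerae lives in seawater."
)

# Simulated conversion noise (invented for this example): page-header
# residue lands in the middle of the first sentence and introduces
# spurious sentence boundaries.
noisy = (
    "Escherichia coli is commonly found in the\n"
    "Vol. 8, p. 184\n"
    "human gut. Vibrio cholerae lives in seawater."
)

clean_pairs = candidate_pairs(clean)
noisy_pairs = candidate_pairs(noisy)
print(clean_pairs)
print(noisy_pairs)
```

In this toy setting the noisy input loses the first organism-habitat pair, because the inserted residue places the organism and the habitat in different "sentences". Any approach that pairs entities within sentence boundaries is exposed to the same kind of failure.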

Published Online: 2016-10-18
Published in Print: 2011-06-01

© 2011 The Author(s). Published by Journal of Integrative Bioinformatics.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.

Downloaded on 31.12.2024 from https://www.degruyter.com/document/doi/10.1515/jib-2011-184/html