Abstract
The BioASQ “Task on Large-Scale Online Biomedical Semantic Indexing” charges participants with assigning semantic tags to biomedical journal abstracts. We present a system that takes a biomedical abstract as input and uses latent semantic analysis to identify similar documents in the MEDLINE database. The system then applies a novel ranking scheme to select a list of MeSH tags from candidates drawn from the most similar documents. Our approach achieved better-than-baseline performance in both precision and recall. We suggest several possible strategies to improve the system’s performance.
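The pipeline described in the abstract (project documents into a latent semantic space, retrieve the most similar indexed documents, then rank their tags) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy corpus, the tag lists, the function names, the raw term counts, and especially the similarity-weighted vote used for ranking, which is a simple stand-in and not the ranking scheme the paper proposes.

```python
from collections import defaultdict

import numpy as np


def build_term_doc_matrix(docs):
    """Return a raw term-count matrix (terms x documents) and the term index."""
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.lower().split():
            counts[index[word], j] += 1.0
    return counts, index


def fit_lsa(counts, k):
    """Truncated SVD: keep only the k largest singular triplets."""
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    u_k = u[:, :k]
    # One row per document; each row equals U_k^T applied to that document.
    doc_vecs = vt[:k, :].T * s[:k]
    return u_k, doc_vecs


def fold_in(text, index, u_k):
    """Project unseen text into the latent space; unknown words are ignored."""
    q = np.zeros(u_k.shape[0])
    for word in text.lower().split():
        if word in index:
            q[index[word]] += 1.0
    return q @ u_k


def suggest_tags(text, docs, doc_tags, k=2, n_neighbors=2, n_tags=3):
    """Rank candidate tags from the nearest documents in latent space.

    Ranking here is a similarity-weighted vote (an illustrative assumption).
    """
    counts, index = build_term_doc_matrix(docs)
    u_k, doc_vecs = fit_lsa(counts, k)
    query = fold_in(text, index, u_k)
    # Cosine similarity between the query and every indexed document.
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query)
    sims = doc_vecs @ query / np.maximum(norms, 1e-12)
    nearest = np.argsort(-sims)[:n_neighbors]
    scores = defaultdict(float)
    for j in nearest:
        for tag in doc_tags[j]:
            scores[tag] += max(float(sims[j]), 0.0)
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_tags]


# Hypothetical four-document corpus with hand-assigned tags.
toy_docs = [
    "insulin glucose diabetes insulin",
    "glucose diabetes metabolism",
    "tumor cancer chemotherapy",
    "cancer tumor radiation",
]
toy_tags = [
    ["Diabetes Mellitus", "Insulin"],
    ["Diabetes Mellitus", "Glucose"],
    ["Neoplasms", "Drug Therapy"],
    ["Neoplasms", "Radiotherapy"],
]

print(suggest_tags("diabetes glucose", toy_docs, toy_tags))
```

A tag shared by several near neighbours accumulates more weight than one that appears only once, so it rises to the top of the list. A realistic system would likely replace the raw counts with a weighting such as TF-IDF and use a sparse truncated SVD to cope with a corpus the size of MEDLINE.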
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Adams, J.R., Bedrick, S. (2015). Automatic Indexing of Journal Abstracts with Latent Semantic Analysis. In: Mothe, J., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2015. Lecture Notes in Computer Science, vol 9283. Springer, Cham. https://doi.org/10.1007/978-3-319-24027-5_17
DOI: https://doi.org/10.1007/978-3-319-24027-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24026-8
Online ISBN: 978-3-319-24027-5
eBook Packages: Computer Science (R0)