GeTFIRST: ontology-based keyword search towards semantic disambiguation
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 16 November 2015
Abstract
Purpose
This paper aims to improve the semantic-disambiguation capability of an information-retrieval system by taking advantage of a well-crafted classification tree. The unstructured nature and sheer volume of information accessible over networks have made it considerably more difficult for users to find relevant information. Many information-retrieval methods have been developed to address this problem, and the keyword-based approach is among the most common. Such an approach is often inadequate to cope with the conceptualization associated with user needs and contents. This gives rise to the problem of semantic ambiguation, which refers to disagreement over the meaning of terms between the parties involved in a communication because of polysemy, and which leads to increased complexity and lower accuracy in information integration, migration, retrieval and other related activities.
Design/methodology/approach
A novel ontology-based search approach, named GeTFIRST (short for Graph-embedded Tree Fostering Information Retrieval SysTem), is proposed to disambiguate keywords semantically. The contribution is twofold. First, a search strategy is proposed that uses our Graph-embedded Tree (GeT)-based ontology to prune irrelevant concepts and thereby improve accuracy. Second, a path-based ranking algorithm is proposed to incorporate and reward content specificity.
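As a rough illustration of the two ideas above (pruning concepts that fall outside a chosen branch of a classification tree, then ranking by path specificity), the following Python sketch may help. It is not the authors' GeTFIRST implementation: the tree, the sample documents, the `search` function and the scoring rule (number of tree nodes from the chosen context concept down to the document's concept) are all assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical classification tree: child concept -> parent concept (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Root": None,
    "Computing": "Root",
    "Biology": "Root",
    "Input devices": "Computing",
    "Pointing devices": "Input devices",
    "Rodents": "Biology",
}

# Hypothetical documents: (title, text, concept the document is classified under).
DOCS: List[Tuple[str, str, str]] = [
    ("Optical mouse sensor", "an optical mouse with a laser sensor", "Pointing devices"),
    ("Mouse and keyboard hub", "a hub connecting a mouse and a keyboard", "Input devices"),
    ("Transgenic mouse model", "a transgenic mouse for disease research", "Rodents"),
    ("Keyboard switch", "a mechanical keyboard switch", "Input devices"),
]


def path_to_root(concept: str) -> List[str]:
    """Return the list of concepts from the root down to `concept`."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def search(keyword: str, context: str, docs: List[Tuple[str, str, str]]) -> List[Tuple[str, int]]:
    """Keyword match, prune documents outside the `context` subtree, rank by specificity."""
    results: List[Tuple[str, int]] = []
    for title, text, concept in docs:
        if keyword.lower() not in text.lower():
            continue  # plain keyword filter
        path = path_to_root(concept)
        if context not in path:
            continue  # prune: the document's concept is not under the chosen context
        # Assumed scoring rule: a longer path below the context means a more specific concept.
        score = len(path) - path.index(context)
        results.append((title, score))
    # Highest score first; ties are broken alphabetically by title.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


if __name__ == "__main__":
    for title, score in search("mouse", "Computing", DOCS):
        print(score, title)
```

In this toy setting, choosing the context `"Computing"` for the ambiguous keyword "mouse" drops the biology document before ranking, and the document classified under the deeper concept is listed first.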
Findings
An empirical evaluation was performed on United States Patent and Trademark Office (USPTO) patent datasets to compare our approach with full-text patent search approaches. The results showed that GeTFIRST handled ambiguous keywords with higher keyword-disambiguation accuracy than traditional search approaches.
Originality/value
The search approach presented in this paper copes with semantic ambiguation by combining our proposed GeT-based ontology with a path-based ranking algorithm.
Acknowledgements
This research is funded by International University VNUHCM under grant number SV2014-IT-01.
Citation
Nguyen, H.-M., Nguyen, H.-Q., Tran, K.-N. and Vo, X.-V. (2015), "GeTFIRST: ontology-based keyword search towards semantic disambiguation", International Journal of Web Information Systems, Vol. 11 No. 4, pp. 442-467. https://doi.org/10.1108/IJWIS-06-2015-0019
Publisher
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited