Smart combination of web measures for solving semantic similarity problems

Jorge Martinez‐Gil (Department of Computer Languages and Computing Sciences, University of Málaga, Málaga, Spain)

José F. Aldana‐Montes (Department of Computer Languages and Computing Sciences, University of Málaga, Málaga, Spain)

Online Information Review

ISSN: 1468-4527

Article publication date: 21 September 2012

Abstract

Purpose

Semantic similarity measures are very important in many computer‐related fields. Previous work on applications such as data integration, query expansion, tag refactoring and text clustering has relied on such measures. Despite their usefulness in these applications, measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.

Design/methodology/approach

In this article, the authors propose an optimization environment to improve existing techniques that use the notion of co‐occurrence and the information available on the web to measure similarity between terms.
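The abstract does not name the specific co‐occurrence measures involved. As a minimal sketch, assuming two widely used web measures (pointwise mutual information and the Normalized Google Distance), the following shows how a score can be derived from page counts alone. The function names, the hit counts and the index size `total_pages` are hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def pmi(hits_a, hits_b, hits_ab, total_pages):
    """Pointwise mutual information from page counts (higher = more related).

    Returns 0.0 when any count is zero; this is a convention assumed here
    to avoid taking log(0), not something stated in the paper.
    """
    if hits_a == 0 or hits_b == 0 or hits_ab == 0:
        return 0.0
    p_a = hits_a / total_pages
    p_b = hits_b / total_pages
    p_ab = hits_ab / total_pages
    return math.log2(p_ab / (p_a * p_b))


def normalized_google_distance(hits_a, hits_b, hits_ab, total_pages):
    """Normalized Google Distance from page counts (lower = more related)."""
    if hits_a == 0 or hits_b == 0 or hits_ab == 0:
        return math.inf  # the terms never co-occur on any indexed page
    log_a, log_b, log_ab = math.log(hits_a), math.log(hits_b), math.log(hits_ab)
    return (max(log_a, log_b) - log_ab) / (math.log(total_pages) - min(log_a, log_b))


# Hypothetical counts for the queries "car", "automobile" and "car automobile".
print(pmi(2_000_000, 500_000, 300_000, 10_000_000_000))
print(normalized_google_distance(2_000_000, 500_000, 300_000, 10_000_000_000))
```

Both functions take only four integers, so the same code can be fed with counts from any search engine that reports the number of matching pages.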

Findings

Experimental results on the Miller and Charles and the Gracia and Mena benchmark datasets show that the proposed approach outperforms classic probabilistic web‐based algorithms by a wide margin.

Originality/value

This paper presents two main contributions. First, the authors propose a novel technique that outperforms classic probabilistic techniques for measuring semantic similarity between terms; rather than relying on a single search engine to compute web page counts, it uses a smart combination of several popular web search engines. Second, the approach is evaluated on the Miller and Charles and the Gracia and Mena benchmark datasets and compared with existing probabilistic web extraction techniques.
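The abstract does not say how the search engines are combined or which optimizer is used. One plausible reading, offered only as a sketch, is a weighted average of per‐engine similarity scores whose weights are tuned to maximize the Pearson correlation with human judgments on a benchmark. The coarse grid search, the function names and all scores below are assumptions made for illustration and should not be taken as the authors' method.

```python
import math
from itertools import product


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def combine(scores_by_engine, weights):
    """Weighted average of per-engine scores, one result per term pair."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, pair_scores)) / total
        for pair_scores in zip(*scores_by_engine)
    ]


def best_weights(scores_by_engine, human_scores, steps=10):
    """Grid-search the weights that best match the human scores."""
    grid = [i / steps for i in range(steps + 1)]
    best = (None, -2.0)  # any real correlation is >= -1
    for weights in product(grid, repeat=len(scores_by_engine)):
        if sum(weights) == 0:
            continue  # an all-zero weighting cannot be normalized
        r = pearson(combine(scores_by_engine, weights), human_scores)
        if r > best[1]:
            best = (weights, r)
    return best


# Hypothetical scores from two engines for four term pairs, plus human ratings.
engine_a = [0.1, 0.9, 0.5, 0.3]
engine_b = [0.4, 0.2, 0.9, 0.1]
human = [0.1, 0.9, 0.5, 0.3]
print(best_weights([engine_a, engine_b], human))
```

A grid search is used here only because it is easy to follow; its cost grows exponentially with the number of engines, so any realistic setup would need a more scalable optimizer.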

Citation

Martinez‐Gil, J. and Aldana‐Montes, J.F. (2012), "Smart combination of web measures for solving semantic similarity problems", Online Information Review, Vol. 36 No. 5, pp. 724-738. https://doi.org/10.1108/14684521211276000

Publisher

Emerald Group Publishing Limited
