Abstract
The problem of matching schemas or ontologies consists of finding corresponding entities in two or more knowledge models that belong to the same domain but have been developed separately. Many techniques and tools now address this problem; however, the complex nature of the matching problem means that existing solutions are not fully satisfactory in real situations. The Google Similarity Distance has appeared recently. Its purpose is to mine knowledge from the Web using the Google search engine in order to semantically compare text expressions. Our work consists of developing a software application for validating results discovered by schema and ontology matching tools using the philosophy behind this distance. Moreover, we are interested in using this similarity distance not only with Google but also with other popular search engines. The results reveal three main facts. Firstly, some web search engines can help us to validate semantic correspondences satisfactorily. Secondly, there are significant differences among the web search engines. Thirdly, the best results are obtained when using combinations of the web search engines that we have studied.
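As a rough illustration of the idea behind this distance, the sketch below computes the normalized distance published by Cilibrasi and Vitányi from search-engine hit counts. It is not the implementation of the tool described here. The function names, the hit counts, the index size, and the acceptance threshold are all hypothetical, and no real search engine API is called.

```python
import math


def normalized_search_distance(hits_x, hits_y, hits_xy, index_size):
    """Normalized distance between two terms, computed from hit counts.

    hits_x, hits_y: pages returned for each term alone.
    hits_xy: pages returned for both terms together.
    index_size: assumed number of pages indexed by the engine (a guess in practice).
    """
    if index_size <= max(hits_x, hits_y):
        raise ValueError("index_size must exceed the individual hit counts")
    if min(hits_x, hits_y, hits_xy) <= 0:
        # Terms that never occur (or never co-occur) are treated as unrelated.
        return math.inf
    log_x, log_y, log_xy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    numerator = max(log_x, log_y) - log_xy
    denominator = math.log(index_size) - min(log_x, log_y)
    return numerator / denominator


def looks_valid(distance, threshold=0.5):
    """Accept a correspondence when the distance is below a hypothetical threshold."""
    return distance < threshold


# Made-up hit counts for two terms proposed as a correspondence by a matching tool.
d = normalized_search_distance(hits_x=1000, hits_y=100, hits_xy=10, index_size=10**6)
print(round(d, 3), looks_valid(d))
```

A smaller value means the two terms tend to appear on the same pages. The same function can be fed counts from different engines, which is one simple way to compare how each engine behaves on the same pair of terms.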
Additional information
This work was supported by the Spanish Ministry of Innovation and Science through the project REALIDAD: Gestion, Analisis y Explotacion Efficiente de Datos Vinculados under Grant No. TIN2011-25840.
About this article
Cite this article
Martinez-Gil, J., Aldana-Montes, J.F. KnoE: A Web Mining Tool to Validate Previously Discovered Semantic Correspondences. J. Comput. Sci. Technol. 27, 1222–1232 (2012). https://doi.org/10.1007/s11390-012-1298-9
DOI: https://doi.org/10.1007/s11390-012-1298-9