Abstract
Semantic similarity measurement aims to automatically compute the degree of similarity between two textual expressions that use different representations to name the same concepts. However, very short textual expressions do not always follow the syntax of a written language and, in general, do not provide enough information to support proper analysis. As a consequence, in some fields, such as the processing of landmarks and points of interest, the results are not entirely satisfactory. To overcome this limitation, we explore the idea of aggregating existing methods by means of two novel aggregation operators that aim to model an appropriate interaction between the similarity measures. As a result, we have been able to improve on the results of existing techniques on GeReSiD and SDTS, two of the most popular benchmark datasets for geographical information.
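To make the idea of aggregating similarity measures concrete, the following is a minimal sketch only. It does not reproduce the two operators proposed in this work, which the abstract does not specify; instead it uses the standard ordered weighted averaging (OWA) operator as a stand-in. The two base measures (`jaccard_tokens`, `char_bigram_dice`), the helper `aggregate`, and the weights are hypothetical choices made for illustration.

```python
from typing import Callable, Sequence

Measure = Callable[[str, str], float]


def jaccard_tokens(a: str, b: str) -> float:
    """Token-level Jaccard similarity (illustrative base measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def char_bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (illustrative base measure)."""
    def bigrams(s: str) -> set:
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        return 1.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted averaging: weights apply to scores sorted descending.

    This is a generic operator, assumed here as a stand-in; it is not
    one of the operators introduced in the paper.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def aggregate(a: str, b: str, measures: Sequence[Measure],
              weights: Sequence[float]) -> float:
    """Score a pair with every base measure, then combine with OWA."""
    return owa([m(a, b) for m in measures], weights)


if __name__ == "__main__":
    score = aggregate("town hall", "city hall",
                      [jaccard_tokens, char_bigram_dice], [0.7, 0.3])
    print(round(score, 3))
```

With weights `[1.0, 0.0]` the OWA operator reduces to the maximum of the base scores, and with `[0.0, 1.0]` to the minimum, so the weight vector controls how optimistic the aggregated judgment is.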
Acknowledgments
We would like to thank the anonymous reviewers for their helpful and constructive comments, which greatly contributed to improving this work. The research reported in this paper has been supported by the Austrian Ministry for Transport, Innovation and Technology, the Federal Ministry of Science, Research and Economy, and the Province of Upper Austria in the frame of the COMET Center SCCH [FFG: 844597].
Cite this article
Martinez-Gil, J. Semantic similarity aggregators for very short textual expressions: a case study on landmarks and points of interest. J Intell Inf Syst 53, 361–380 (2019). https://doi.org/10.1007/s10844-019-00561-0
DOI: https://doi.org/10.1007/s10844-019-00561-0