research-article

Natural language queries over heterogeneous linked data graphs: a distributional-compositional semantics approach

Published: 24 February 2014

Abstract

The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the representation of the data. This work aims to provide a natural language interface and an associated semantic index to support an increased level of vocabulary independence for queries over Linked Data/Semantic Web datasets, using a distributional-compositional semantics approach. Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of co-occurring words in large-scale texts. The proposed query model targets the following features: (i) a principled semantic approximation approach with low adaptation effort (independent of manually created resources such as ontologies, thesauri or dictionaries), (ii) comprehensive semantic matching supported by the inclusion of large volumes of distributional (unstructured) commonsense knowledge in the semantic approximation process, and (iii) expressive natural language queries. The approach was evaluated using natural language queries over an open-domain dataset and achieved an average recall of 0.81, a mean average precision of 0.62, and a mean reciprocal rank of 0.49.
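For readers unfamiliar with the three reported measures, the following is a minimal sketch of how average recall, mean average precision, and mean reciprocal rank are commonly computed over ranked answer lists. It uses the standard textbook definitions, which may differ in detail from the paper's evaluation protocol. The function names and the toy data are assumptions made for illustration and do not come from the paper.

```python
def recall(ranked, relevant):
    """Fraction of the relevant answers that appear anywhere in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant answer occurs."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant answer, or 0 if none is returned."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs):
    """Average the per-query measures over (ranked_answers, gold_answers) pairs."""
    n = len(runs)
    return {
        "avg_recall": sum(recall(r, g) for r, g in runs) / n,
        "map": sum(average_precision(r, g) for r, g in runs) / n,
        "mrr": sum(reciprocal_rank(r, g) for r, g in runs) / n,
    }


# Hypothetical toy data: two queries with ranked system answers and gold answers.
TOY_RUNS = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y", "z"}),
]

if __name__ == "__main__":
    print(evaluate(TOY_RUNS))
```

Each measure is computed per query and then averaged, so a single query with many gold answers does not dominate the score.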


Reviews

Yingjie Li

Using natural language to query large amounts of heterogeneous linked data is a difficult problem. This paper proposes a distributional-compositional semantics approach to answering queries over linked data. Because it does not rely heavily on ontologies, the approach is data-centered and data-driven. Researchers in natural language processing (NLP), linked data, the semantic web, and query answering will want to study this work.

The proposed approach first constructs a distributional semantic vector space, comprising a term vector space and a concept vector space, for a given data collection. The term vector space is built from all terms available in the dataset. The concept vector space is built by extracting the co-occurrence patterns of each term from the dataset. The resource description framework (RDF) graph data is then rewritten using the term and concept vector spaces: each RDF predicate is represented as a weighted concept vector, and each RDF instance is represented as a weighted term vector.

Given a natural language query, the proposed system first extracts a set of query features and resolves the query into an RDF-like format. After the query is analyzed, the system generates a query processing plan that maps the extracted query features and the semi-structured query representation into a set of search, navigation, and transformation operations over the queried data graph. Finally, the operations of the query processing plan are executed over the semantic vector space to identify the results of the query.

The experiments tested the proposed approach using the Question Answering over Linked Data 2011 test collection (QALD-1). The results demonstrate that the proposed system achieves better recall, precision, and reciprocal rank than other systems, including PowerAqua and FREyA.
However, the paper would have been more comprehensive if the authors had evaluated the proposed system on more than one dataset.

Online Computing Reviews Service
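The vocabulary-independent matching step summarized in the review can be illustrated with a small sketch. It represents each term as a weighted vector over "concepts" and ranks candidate RDF predicate labels by cosine relatedness to a query term. This is not the authors' implementation. The toy corpus, the use of raw term frequency as the weight, and the names `term_vector`, `cosine`, and `rank_predicates` are all assumptions made for illustration.

```python
import math

# Hypothetical toy corpus: each document stands in for one "concept" dimension.
TOY_CORPUS = {
    "doc_marriage": "spouse wife husband married marriage wife",
    "doc_family": "wife husband child daughter son family",
    "doc_geography": "city capital country population river",
}


def term_vector(term, corpus):
    """Sparse concept vector: raw frequency of `term` in each document."""
    vec = {}
    for doc_id, text in corpus.items():
        tf = text.split().count(term)
        if tf:
            vec[doc_id] = float(tf)
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_predicates(query_term, predicates, corpus):
    """Order candidate predicate labels by relatedness to the query term."""
    q = term_vector(query_term, corpus)
    scored = [(p, cosine(q, term_vector(p, corpus))) for p in predicates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # The query word "wife" never matches a predicate label literally,
    # yet "spouse" ranks first because the two words share contexts.
    for label, score in rank_predicates("wife", ["capital", "child", "spouse"], TOY_CORPUS):
        print(f"{label}: {score:.3f}")
```

The point of the sketch is that relatedness comes from shared contexts in text rather than from a hand-built thesaurus, which is why such an approach needs little manual adaptation when the dataset vocabulary changes.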


Published In

IUI '14: Proceedings of the 19th international conference on Intelligent User Interfaces
February 2014
386 pages
ISBN:9781450321846
DOI:10.1145/2557500
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. databases
  2. distributional semantics
  3. linked data
  4. natural language interface
  5. question answering
  6. semantic interface
  7. semantic search
  8. semantic web

Qualifiers

  • Research-article

Conference

IUI'14
Acceptance Rates

IUI '14 Paper Acceptance Rate: 46 of 191 submissions (24%)
Overall Acceptance Rate: 746 of 2,811 submissions (27%)

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 7
  • Downloads (last 6 weeks): 0
Reflects downloads up to 05 Jan 2025

Citations

Cited By

  • (2024) Influence of Event Specialization Strategy on Some Aspects of Natural Language Querying Interfaces to Ontologies. IEEE Access 12, 165780-165796. DOI: 10.1109/ACCESS.2024.3489889. Online publication date: 2024
  • (2022) Efficient SPARQL Queries Generator for Question Answering Systems. IEEE Access 10, 99850-99860. DOI: 10.1109/ACCESS.2022.3206794. Online publication date: 2022
  • (2021) Intelligent SPARQL Query Generation for Natural Language Processing Systems. IEEE Access 9, 158638-158650. DOI: 10.1109/ACCESS.2021.3130667. Online publication date: 2021
  • (2020) Interacting with Linked Data: A Survey from the SIGCHI Perspective. Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, 1-12. DOI: 10.1145/3334480.3382909. Online publication date: 25-Apr-2020
  • (2020) Intelligent Text Clustering Based on Semantics Similarity. 2020 1st. Information Technology To Enhance e-learning and Other Application (IT-ELA, 60-66. DOI: 10.1109/IT-ELA50150.2020.9253127. Online publication date: 12-Jul-2020
  • (2019) A Novel IR for Relational Database using Optimize Query Building. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 251-257. DOI: 10.32628/CSEIT195438. Online publication date: 5-Aug-2019
  • (2019) Leveraging Linked Open Data to Automatically Answer Arabic Questions. IEEE Access 7, 177122-177136. DOI: 10.1109/ACCESS.2019.2956233. Online publication date: 2019
  • (2019) Crowdsourcing-based semantic relation recognition for natural language questions over RDF data. Enterprise Information Systems 13:7-8, 935-958. DOI: 10.1080/17517575.2019.1597385. Online publication date: 30-Mar-2019
  • (2019) A Real-time Linked Dataspace for the Internet of Things: Enabling “Pay-As-You-Go” Data Management in Smart Environments. Future Generation Computer Systems 90, 405-422. DOI: 10.1016/j.future.2018.07.019. Online publication date: Jan-2019
  • (2018) Semantic Oriented Document Clustering Using Distribution Semantics. Proceedings of the 2nd International Conference on Information System and Data Mining, 14-18. DOI: 10.1145/3206098.3206110. Online publication date: 9-Apr-2018
