Authors:
Marco Antoni 1; Andrea Schwertner Charão 1 and Maria Helena Franciscatto 2
Affiliations:
1 Department of Languages and Computer Systems, Federal University of Santa Maria, Santa Maria, Brazil; 2 Department of Computer Science, Federal University of Paraná, Curitiba, Brazil
Keyword(s):
Question Answering, Open Data, Educational Information Retrieval, Natural Language Processing.
Abstract:
The need to capture information suited to the user has favored the development of Question Answering (QA) systems, whose main goal is to retrieve a precise answer to a question expressed in Natural Language. These systems have been adopted in many domains to make data accessible, including Open Data. Although many QA approaches access Open Data sources, querying Brazilian Open Data remains a research gap, possibly because of the complexity that the Portuguese language presents to Natural Language Processing (NLP) approaches. For this reason, this paper proposes a hybrid NLP-based approach for querying Open Data from the Brazilian Educational Census. The proposed solution combines linguistic and rule-based NLP approaches, which are applied in two main processing stages (Text Preprocessing and Question Mapping) to identify the meaning of an input question and optimize the querying process. Our approach was evaluated through a QA prototype developed as a Web interface and showed feasible results, since concise and accurate answers were presented to the user.