Abstract
Open Information Extraction (OpenIE) is the task of extracting structured information from text. Recent applications of deep learning to OpenIE have advanced the state of the art, but little of this work addresses languages other than English. In this work, we propose PortNOIE, a neural framework for open information extraction in Portuguese. We evaluate our method on a manually annotated corpus of OpenIE extractions, where it outperforms the current state-of-the-art OpenIE systems for Portuguese, both rule-based and neural.
Acknowledgement
We would like to thank FAPESB for financial support.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Cabral, B., Souza, M., Claro, D.B. (2022). PortNOIE: A Neural Framework for Open Information Extraction for the Portuguese Language. In: Pinheiro, V., et al. Computational Processing of the Portuguese Language. PROPOR 2022. Lecture Notes in Computer Science, vol 13208. Springer, Cham. https://doi.org/10.1007/978-3-030-98305-5_23
DOI: https://doi.org/10.1007/978-3-030-98305-5_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98304-8
Online ISBN: 978-3-030-98305-5
eBook Packages: Computer Science (R0)