Abstract
This paper analyzes a large neural network model, BERT, by placing its capability for word prediction in context under the framework of Ontological Semantics. BERT has reportedly performed well in tasks that require semantic competence without any explicit semantic inductive bias. We posit that word prediction in context can be interpreted as the task of inferring the meaning of an unknown word, a practice employed by several papers following the Ontological Semantic Technology (OST) approach to Natural Language Understanding. Using this approach, we deconstruct BERT’s output for an example sentence and interpret it using OST’s fuzziness handling mechanisms, revealing the degree to which each output satisfies the sentence’s constraints.
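To make the idea of a graded degree of constraint satisfaction concrete, the following is a minimal sketch and not the paper's or OST's actual mechanism. It assumes a toy is-a hierarchy, a hypothetical masked sentence such as "She ate the [MASK]." whose object slot is constrained to the concept food, and an invented scoring rule that halves the degree for each is-a step the constraint must be relaxed. The candidate words, the hierarchy, and the names `IS_A`, `lineage`, `membership`, and `rank_fillers` are all illustrative assumptions, not BERT output or OST resources.

```python
# Toy is-a hierarchy (child -> parent). Hypothetical; not an OST ontology.
IS_A = {
    "apple": "fruit",
    "fruit": "food",
    "bread": "food",
    "food": "physical-object",
    "dog": "mammal",
    "mammal": "animal",
    "animal": "physical-object",
    "idea": "abstract-object",
}


def lineage(concept):
    """Return the concept followed by its is-a ancestors, nearest first."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def membership(candidate, constraint):
    """Assumed degree in [0, 1] to which candidate satisfies constraint.

    1.0 if the candidate is-a constraint; otherwise 0.5 per is-a step the
    constraint must be generalized before it covers the candidate; 0.0 if
    the two concepts share no ancestor. The decay rule is an assumption.
    """
    candidate_line = lineage(candidate)
    for steps, ancestor in enumerate(lineage(constraint)):
        if ancestor in candidate_line:
            return 0.5 ** steps
    return 0.0


def rank_fillers(candidates, constraint):
    """Sort candidate fillers by their assumed degree of satisfaction."""
    scored = [(word, membership(word, constraint)) for word in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidates for the masked slot, constrained to "food".
ranked = rank_fillers(["dog", "apple", "idea", "bread"], "food")
for word, degree in ranked:
    print(f"{word}: {degree}")
```

In this sketch, fillers that fall under the constraint receive full membership, a filler that only shares a more general ancestor receives partial membership, and an unrelated filler receives none. A real analysis would replace the toy hierarchy with an ontology and the candidate list with a model's actual predictions.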
Notes
1. Is-a relationship, for example: a dog is a mammal.
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Misra, K., Rayz, J.T. (2022). An Approximate Perspective on Word Prediction in Context: Ontological Semantics Meets BERT. In: Bede, B., Ceberio, M., De Cock, M., Kreinovich, V. (eds) Fuzzy Information Processing 2020. NAFIPS 2020. Advances in Intelligent Systems and Computing, vol 1337. Springer, Cham. https://doi.org/10.1007/978-3-030-81561-5_14
DOI: https://doi.org/10.1007/978-3-030-81561-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81560-8
Online ISBN: 978-3-030-81561-5
eBook Packages: Intelligent Technologies and Robotics (R0)