Abstract
Machine readable dictionaries (MRDs) contain knowledge about language and the world essential for tasks in natural language processing (NLP). However, this knowledge, collected and recorded by lexicographers for human readers, is not presented in a manner for MRDs to be used directly for NLP tasks. What is badly needed are machine tractable dictionaries (MTDs): MRDs transformed into a format usable for NLP. This paper discusses three different but related large-scale computational methods to transform MRDs into MTDs. The MRD used is The Longman Dictionary of Contemporary English (LDOCE). The three methods differ in the amount of knowledge they start with and the kinds of knowledge they provide. All require some handcoding of initial information but are largely automatic. Method I, a statistical approach, uses the least handcoding. It generates “relatedness” networks for words in LDOCE and presents a method for doing partial word sense disambiguation. Method II employs the most handcoding because it develops and builds lexical entries for a very carefully controlled defining vocabulary of 2,000 word senses (1,000 words). The payoff is that the method will provide an MTD containing highly structured semantic information. Method III requires the handcoding of a grammar and the semantic patterns used by its parser, but not the handcoding of any lexical material. This is because the method builds up lexical material from sources wholly within LDOCE. The information extracted is a set of sources of information, individually weak, but which can be combined to give a strong and determinate linguistic data base.
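The abstract describes Method I only in outline, so the following is a minimal sketch of the general idea rather than the authors' procedure. It assumes that relatedness between two words can be estimated from how often they appear together in the same dictionary definition (scored here with a Dice coefficient) and that a sense can be chosen by counting words shared between its definition and the surrounding context. The toy definitions, the stopword list, and all function names are hypothetical and are not drawn from LDOCE.

```python
from collections import Counter
from itertools import combinations

# Toy definitions standing in for dictionary entries (hypothetical, not LDOCE text).
DEFINITIONS = {
    "bank_1": "land along the side of a river",
    "bank_2": "a place where money is kept",
    "river": "a wide stream of water flowing across land",
    "money": "coins or paper notes used to pay",
}

# Small illustrative stopword list (an assumption for this sketch).
STOPWORDS = {"a", "of", "the", "is", "or", "to", "where", "along", "across"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def cooccurrence(definitions):
    """Count, per word and per word pair, the definitions they occur in."""
    word_freq = Counter()
    pair_freq = Counter()
    for text in definitions.values():
        words = set(tokens(text))
        word_freq.update(words)
        pair_freq.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_freq, pair_freq


def relatedness(a, b, word_freq, pair_freq):
    """Dice coefficient over definition co-occurrence (one possible measure)."""
    if a == b:
        return 1.0
    total = word_freq[a] + word_freq[b]
    if total == 0:
        return 0.0
    return 2 * pair_freq[frozenset((a, b))] / total


def best_sense(senses, context, definitions):
    """Pick the sense whose definition shares the most words with the context."""
    ctx = set(tokens(context))
    return max(senses, key=lambda s: len(ctx & set(tokens(definitions[s]))))


word_freq, pair_freq = cooccurrence(DEFINITIONS)
chosen = best_sense(["bank_1", "bank_2"], "the boat drifted toward the river", DEFINITIONS)
```

Treating each pairwise score as an edge weight gives a simple word network, and the overlap count is only a partial disambiguator: it cannot separate senses whose definitions share no words with the context.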
Cite this article
Wilks, Y., Fass, D., Guo, Cm. et al. Providing machine tractable dictionary tools. Machine Translation 5, 99–154 (1990). https://doi.org/10.1007/BF00393758