Article

The Winograd schema challenge

Authors:

Hector J. Levesque,

Leora Morgenstern

KR'12: Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning

Pages 552 - 561

Published: 10 June 2012

Abstract

In this paper, we present an alternative to the Turing Test that has some conceptual and practical advantages. A Winograd schema is a pair of sentences that differ only in one or two words and that contain a referential ambiguity that is resolved in opposite directions in the two sentences. We have compiled a collection of Winograd schemas, designed so that the correct answer is obvious to the human reader, but cannot easily be found using selectional restrictions or statistical techniques over text corpora. A contestant in the Winograd Schema Challenge is presented with a collection of one sentence from each pair, and required to achieve human-level accuracy in choosing the correct disambiguation.
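As an illustration only, the definition in the abstract can be sketched as a small data structure plus an accuracy score. The class and function names below (`SchemaSentence`, `make_schema`, `accuracy`, `first_candidate_baseline`) are hypothetical and not taken from the paper, and the example sentence is the widely quoted councilmen/demonstrators pair attributed to Terry Winograd, not text from the abstract itself.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class SchemaSentence:
    """One half of a schema: a sentence, its ambiguous pronoun, two candidate referents."""
    text: str
    pronoun: str
    candidates: Tuple[str, str]
    answer: str  # the referent a human reader is expected to choose


def make_schema(template: str, pronoun: str, candidates: Tuple[str, str],
                special: str, alternate: str) -> Tuple[SchemaSentence, SchemaSentence]:
    """Build both sentences of a schema by swapping one word in the template.

    Assumption of this sketch: `special` resolves the pronoun to candidates[0]
    and `alternate` resolves it to candidates[1].
    """
    first = SchemaSentence(template.format(word=special), pronoun, candidates, candidates[0])
    second = SchemaSentence(template.format(word=alternate), pronoun, candidates, candidates[1])
    return first, second


def accuracy(resolver: Callable[[SchemaSentence], str],
             sentences: Sequence[SchemaSentence]) -> float:
    """Fraction of sentences for which the resolver picks the expected referent."""
    if not sentences:
        raise ValueError("need at least one sentence to score")
    correct = sum(1 for s in sentences if resolver(s) == s.answer)
    return correct / len(sentences)


def first_candidate_baseline(sentence: SchemaSentence) -> str:
    """Trivial resolver that ignores the text and always picks the first candidate."""
    return sentence.candidates[0]


PAIR = make_schema(
    "The city councilmen refused the demonstrators a permit because they {word} violence.",
    pronoun="they",
    candidates=("the city councilmen", "the demonstrators"),
    special="feared",
    alternate="advocated",
)
```

Because the two halves resolve in opposite directions, a resolver that ignores the changed word, such as `first_candidate_baseline`, is right on exactly one half of each pair. A contestant would see only one sentence from each pair and would be scored against human-level accuracy, which this sketch does not attempt to estimate.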


Published In

KR'12: Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning

June 2012

652 pages

ISBN: 9781577355601

Publisher: AAAI Press
