DOI: 10.5555/3031843.3031909

The Winograd schema challenge

Published: 10 June 2012

Abstract

In this paper, we present an alternative to the Turing Test that has some conceptual and practical advantages. A Winograd schema is a pair of sentences that differ only in one or two words and that contain a referential ambiguity that is resolved in opposite directions in the two sentences. We have compiled a collection of Winograd schemas, designed so that the correct answer is obvious to the human reader, but cannot easily be found using selectional restrictions or statistical techniques over text corpora. A contestant in the Winograd Schema Challenge is presented with a collection of one sentence from each pair, and required to achieve human-level accuracy in choosing the correct disambiguation.
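
The structure described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' format or scoring procedure: the class name `WinogradSchema`, its fields, and the helpers `build_test`, `accuracy`, and `first_candidate` are all hypothetical, and the trophy/suitcase sentence is a widely quoted Winograd schema used here purely as sample data. The sketch represents a schema as a sentence template plus the two special words, draws one sentence from each pair, and scores a resolver by the fraction of referents it picks correctly.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WinogradSchema:
    """One schema (hypothetical layout): a template with a slot for the special word."""
    template: str      # contains "{word}" where the two sentences differ
    pronoun: str       # the ambiguous referring expression
    candidates: tuple  # the two possible referents
    answers: dict      # special word -> correct referent

    def sentences(self):
        """Return (sentence, correct referent) for each sentence of the pair."""
        return [(self.template.format(word=w), ref) for w, ref in self.answers.items()]


def build_test(schemas, rng):
    """Pick one sentence from each pair, as the abstract describes."""
    items = []
    for s in schemas:
        sentence, gold = rng.choice(s.sentences())
        items.append((sentence, s.pronoun, s.candidates, gold))
    return items


def accuracy(resolver, items):
    """Fraction of items for which the resolver names the correct referent."""
    correct = sum(resolver(sent, pron, cands) == gold for sent, pron, cands, gold in items)
    return correct / len(items)


# Sample data: a commonly quoted schema, included only for illustration.
SCHEMAS = [
    WinogradSchema(
        template="The trophy doesn't fit in the brown suitcase because it is too {word}.",
        pronoun="it",
        candidates=("the trophy", "the suitcase"),
        answers={"big": "the trophy", "small": "the suitcase"},
    ),
]


def first_candidate(sentence, pronoun, candidates):
    """Trivial baseline (an assumption, not from the paper): pick the first candidate."""
    return candidates[0]


items = build_test(SCHEMAS, random.Random(0))
print(accuracy(first_candidate, items))
```

Because the two sentences of a pair resolve the pronoun in opposite directions, a resolver that ignores the special word, such as `first_candidate`, gets exactly one sentence of each pair right. Scored over both sentences of every pair it sits at 50%, which is one way to see why the paired design resists simple heuristics.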

Published In

KR'12: Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning
June 2012
652 pages

Publisher

AAAI Press

Qualifiers

  • Article

Cited By

  • (2024) Random masking finds winning tickets for parameter efficient fine-tuning. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3694358 (55501-55524). Online publication date: 21-Jul-2024
  • (2024) Localizing task information for improved model merging and compression. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3694127 (50268-50287). Online publication date: 21-Jul-2024
  • (2024) Beyond Chinchilla-optimal. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3693840 (43445-43460). Online publication date: 21-Jul-2024
  • (2024) Evaluating quantized large language models. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3693214 (28480-28524). Online publication date: 21-Jul-2024
  • (2024) On the origins of linear representations in large language models. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3692949 (21879-21911). Online publication date: 21-Jul-2024
  • (2024) PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts. Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis. 10.1145/3689217.3690621 (57-68). Online publication date: 19-Nov-2024
  • (2024) Comparing ChatGPT and Humans on World Knowledge and Common-sense Reasoning Tasks: A case study of the Japanese Winograd Schema Challenge. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. 10.1145/3613905.3650975 (1-9). Online publication date: 11-May-2024
  • (2023) Language is not all you need. Proceedings of the 37th International Conference on Neural Information Processing Systems. 10.5555/3666122.3669277 (72096-72109). Online publication date: 10-Dec-2023
  • (2023) COCO-counterfactuals. Proceedings of the 37th International Conference on Neural Information Processing Systems. 10.5555/3666122.3669239 (71195-71221). Online publication date: 10-Dec-2023
  • (2023) BayesTune. Proceedings of the 37th International Conference on Neural Information Processing Systems. 10.5555/3666122.3668972 (65317-65365). Online publication date: 10-Dec-2023
