FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in LLMs

Abstract

Fuzzy reasoning is vital due to the frequent use of imprecise information in daily contexts. However, the ability of current large language models (LLMs) to handle such reasoning remains largely uncharted. In this paper, we introduce a new benchmark, FRoG, for fuzzy reasoning, featuring real-world mathematical word problems that incorporate generalized quantifiers. Our experimental findings reveal that fuzzy reasoning continues to pose significant challenges for LLMs. Moreover, we find that existing methods designed to enhance reasoning do not consistently improve performance in tasks involving fuzzy logic. Additionally, our results show an inverse scaling effect in the performance of LLMs on FRoG. Interestingly, we also demonstrate that strong mathematical reasoning skills are not necessarily indicative of success on our benchmark.

Anthology ID: 2024.emnlp-main.411
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7239–7256
URL: https://aclanthology.org/2024.emnlp-main.411
DOI: 10.18653/v1/2024.emnlp-main.411
Cite (ACL): Yiyuan Li, Shichao Sun, and Pengfei Liu. 2024. FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in LLMs. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 7239–7256, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in LLMs (Li et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.411.pdf
