Computer Science > Computation and Language

arXiv:2403.11858 (cs)

[Submitted on 18 Mar 2024]

Title: GPT-4 as Evaluator: Evaluating Large Language Models on Pest Management in Agriculture

Authors: Shanglong Yang, Zhipeng Yuan, Shunbao Li, Ruoling Peng, Kang Liu, Po Yang

Abstract: In the rapidly evolving field of artificial intelligence (AI), the application of large language models (LLMs) in agriculture, particularly in pest management, remains nascent. We aimed to demonstrate its feasibility by evaluating the pest management advice generated by LLMs, including the Generative Pre-trained Transformer (GPT) series from OpenAI and the FLAN series from Google. Because agricultural advice is context-specific, automatically measuring the quality of LLM-generated text is a significant challenge. We proposed using GPT-4 as an evaluator to score the generated content on Coherence, Logical Consistency, Fluency, Relevance, Comprehensibility, and Exhaustiveness. Additionally, we integrated an expert system based on crop threshold data as a baseline to score Factual Accuracy, that is, whether the models correctly judged if pests found in crop fields warrant management action. Each model's scores were combined using percentage weights to obtain a final score. The results showed that GPT-3.5 and GPT-4 outperformed the FLAN models in most evaluation categories. Furthermore, instruction-based prompting containing domain-specific knowledge achieved an accuracy rate of 72%, supporting the feasibility of LLMs as a tool for providing pest management suggestions.
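
The scoring scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the weights, the 0-to-1 score scale, the single numeric threshold, and the function names `needs_action`, `factual_accuracy`, and `final_score` are all assumptions, since the abstract does not state them.

```python
from typing import Dict, Sequence

# Hypothetical percentage weights (summing to 100). The abstract says scores
# were "weighted by percentage" but does not give the actual values.
WEIGHTS: Dict[str, float] = {
    "coherence": 10,
    "logical_consistency": 10,
    "fluency": 10,
    "relevance": 10,
    "comprehensibility": 10,
    "exhaustiveness": 10,
    "factual_accuracy": 40,
}


def needs_action(pest_density: float, threshold: float) -> bool:
    """Threshold baseline: act when the observed density reaches the threshold."""
    return pest_density >= threshold


def factual_accuracy(
    model_says_act: Sequence[bool],
    pest_densities: Sequence[float],
    threshold: float,
) -> float:
    """Fraction of cases where the model's decision matches the threshold baseline."""
    if len(model_says_act) != len(pest_densities) or not model_says_act:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        said == needs_action(density, threshold)
        for said, density in zip(model_says_act, pest_densities)
    )
    return matches / len(model_says_act)


def final_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-criterion scores, all assumed to lie in [0, 1]."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    return sum(scores[name] * weights[name] for name in weights) / sum(weights.values())
```

Under these assumed weights, a model that scores 0.8 on each of the six evaluator-rated criteria and agrees with the threshold baseline in three of four cases (0.75) would receive a final score of about 0.78.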

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2403.11858 [cs.CL]
	(or arXiv:2403.11858v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.11858

Submission history

From: Shanglong Yang
[v1] Mon, 18 Mar 2024 15:08:01 UTC (146 KB)
