extended-abstract

Open access

Examining the Feasibility of AI-Generated Questions in Educational Settings

Authors:

Omar Zeghouani,

William Simson van Dijkhuizen,

Jeremie Clos

TAS '24: Proceedings of the Second International Symposium on Trustworthy Autonomous Systems

Article No.: 36, Pages 1 - 6

https://doi.org/10.1145/3686038.3686652

Published: 16 September 2024

Abstract

Educators face ever-growing time constraints, leading to poor work-life balance and a negative impact on work quality. Through their language generation capabilities, large language models offer an interesting avenue to ease this academic workload, allowing both students and lecturers to generate educational content. In this work, we leverage the latest developments in automatic speech recognition, natural language generation, retrieval-augmented generation, and multimodal models to design the Augmented Lecture Integration Network (ALINet), a system capable of producing a diverse range of high-quality assessment questions from lecture content. We inform the design of our system through a series of automated experiments using public datasets and evaluate it in a user study with students and educators. Our results indicate a generally positive perception of the system’s performance, particularly in generating natural and clear questions relevant to the taught content, demonstrating its potential as a valuable resource in educational settings. This project lays the foundation for future research in multimodal educational question generation and is available for reuse in our public repository.

Information & Contributors

Information

Published In

TAS '24: Proceedings of the Second International Symposium on Trustworthy Autonomous Systems

September 2024

335 pages

ISBN: 9798400709890

DOI: 10.1145/3686038

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Extended-abstract
Research
Refereed limited

Conference

TAS '24

TAS '24: Second International Symposium on Trustworthy Autonomous Systems

September 16 - 18, 2024

Austin, TX, USA
