DOI: 10.1145/3686038.3686652
extended-abstract
Open access

Examining the Feasibility of AI-Generated Questions in Educational Settings

Published: 16 September 2024

Abstract

Educators face ever-growing time constraints, leading to poor work-life balance and a negative impact on work quality. Through their language generation capabilities, large language models offer an interesting avenue to ease this academic workload, allowing both students and lecturers to generate educational content. In this work, we leverage the latest developments in automatic speech recognition, natural language generation, retrieval-augmented generation, and multimodal models to design the Augmented Lecture Integration Network (ALINet), a system capable of producing a diverse range of high-quality assessment questions from lecture content. We inform the design of our system through a series of automated experiments using public datasets and evaluate it with a user study conducted with students and educators. Our results indicate a generally positive perception of the system’s performance, particularly in generating natural and clear questions relevant to the taught content, demonstrating its potential as a valuable resource in educational settings. This project lays the foundation for future research in multimodal educational question generation and is available for reuse in our public repository.

Published In

ACM Other conferences
TAS '24: Proceedings of the Second International Symposium on Trustworthy Autonomous Systems
September 2024
335 pages
ISBN: 9798400709890
DOI: 10.1145/3686038
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Educational Question Generation
  2. Generative AI
  3. Large Language Models

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

TAS '24
