Abstract
The emergence of generative AI tools powered by Large Language Models (LLMs) has demonstrated remarkable content-generation capabilities, and assessing the usefulness of such content has become an interesting research question. In this exploratory analysis, we prompt-engineer ChatGPT and Google Bard to generate clinical content and compare the output with real biomedical literature produced by scientists. Our approach applies text-mining methods to compare documents and bigrams, and network analysis to assess term centrality. The experiments demonstrated that ChatGPT outperformed Google Bard across the similarity and term-network centrality measures, though both tools achieved good results relative to the baseline.
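The bigram-similarity and term-centrality comparisons the abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the sample texts, the whitespace tokenizer, and the function names are assumptions introduced here for demonstration.

```python
from collections import Counter

def bigrams(text):
    """Extract the set of word bigrams from a lowercased text."""
    tokens = text.lower().split()
    return {(a, b) for a, b in zip(tokens, tokens[1:])}

def jaccard(set_a, set_b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def degree_centrality(edges):
    """Normalized degree centrality of each term in a co-occurrence network
    whose edges are (term, term) pairs, e.g. the bigrams of a document."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()} if n > 1 else {}

# Hypothetical generated vs. reference sentences for illustration only.
generated = "large language models encode clinical knowledge"
reference = "large language models encode broad clinical knowledge"
print(round(jaccard(bigrams(generated), bigrams(reference)), 3))  # 0.571
```

Higher Jaccard scores indicate greater bigram overlap with the reference corpus, and terms with high degree centrality in the bigram network mark the vocabulary a document is organized around; the paper reports such measures aggregated over many generated and real documents.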
Acknowledgements
This publication is partially supported by the European Union's Horizon 2020 research and innovation programme under grant agreement Sano No. 857533, and was carried out within the International Research Agendas programme of the Foundation for Polish Science, co-financed by the European Union under the European Regional Development Fund. It was additionally created in part under the Ministry of Science and Higher Education's initiative to support the activities of Excellence Centers established in Poland under the Horizon 2020 programme, based on agreement No. MEiN/2023/DIR/3796.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Klimczak, J., Abdeen Hamed, A. (2024). Quantifying Similarity: Text-Mining Approaches to Evaluate ChatGPT and Google Bard Content in Relation to BioMedical Literature. In: Franco, L., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2024. ICCS 2024. Lecture Notes in Computer Science, vol 14836. Springer, Cham. https://doi.org/10.1007/978-3-031-63775-9_18
DOI: https://doi.org/10.1007/978-3-031-63775-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63774-2
Online ISBN: 978-3-031-63775-9
eBook Packages: Computer Science, Computer Science (R0)