DOI: 10.1145/3543507.3583870
short-paper
Open access

Identifying Checkworthy CURE Claims on Twitter

Published: 30 April 2023

Abstract

Medical claims on social media, if left unchecked, have the potential to directly affect the well-being of consumers of online health information. However, existing studies on claim detection do not specifically focus on medical cure aspects, nor do they address whether a cure claim is “checkworthy”, an indicator of whether a claim is potentially beneficial or harmful if left unchecked. In this paper, we address these limitations by compiling CW-CURE, a novel dataset of CURE tweets, namely tweets containing claims on prevention, diagnoses, risks, treatments, and cures of medical conditions. CW-CURE contains tweets on four major health conditions, namely Alzheimer’s disease, Cancer, Diabetes, and Depression, annotated for claims, their “checkworthiness”, and the different types of claims, such as quantitative claim, correlation/causation, personal experience, and future prediction. We describe our processing pipeline for compiling CW-CURE and present classification results on CURE tweets using transformer-based models. In particular, we harness claim-type information obtained with zero-shot learning to show significant improvements in checkworthiness identification. Through CW-CURE, we hope to enable research on models for effective identification and flagging of impactful CURE content, to safeguard the public’s consumption of medical content online.
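
To make the annotation layers described above concrete, the following is a minimal sketch of how a single CW-CURE record might be represented. It is an illustration only: the class and field names (`CureTweet`, `is_claim`, `is_checkworthy`, `claim_types`), the helper `augment_with_claim_types`, and the bracketed input format are all assumptions, not the dataset's actual schema or the authors' method. The abstract lists the four claim types with "such as", so the real label set may be larger.

```python
from dataclasses import dataclass, field
from enum import Enum


class Condition(Enum):
    # The four health conditions named in the abstract.
    ALZHEIMERS = "Alzheimer's disease"
    CANCER = "Cancer"
    DIABETES = "Diabetes"
    DEPRESSION = "Depression"


class ClaimType(Enum):
    # Claim types named in the abstract; the full label set may differ.
    QUANTITATIVE = "quantitative claim"
    CORRELATION_CAUSATION = "correlation/causation"
    PERSONAL_EXPERIENCE = "personal experience"
    FUTURE_PREDICTION = "future prediction"


@dataclass(frozen=True)
class CureTweet:
    # Hypothetical record layout; field names are not taken from the paper.
    text: str
    condition: Condition
    is_claim: bool
    is_checkworthy: bool
    claim_types: frozenset = field(default_factory=frozenset)


def augment_with_claim_types(tweet: CureTweet) -> str:
    """Append claim-type names to the tweet text.

    This is one assumed way of exposing claim-type information to a
    downstream checkworthiness classifier; the paper may do it differently.
    """
    if not tweet.claim_types:
        return tweet.text
    tags = ", ".join(sorted(t.value for t in tweet.claim_types))
    return f"{tweet.text} [claim types: {tags}]"


# A made-up example record, not drawn from the dataset.
example = CureTweet(
    text="A made-up tweet: drug X cut my blood sugar by 30%.",
    condition=Condition.DIABETES,
    is_claim=True,
    is_checkworthy=True,
    claim_types=frozenset({ClaimType.QUANTITATIVE, ClaimType.PERSONAL_EXPERIENCE}),
)
print(augment_with_claim_types(example))
```

In this sketch the claim types would come from a zero-shot classifier rather than from manual entry; that component is left out here because the abstract does not specify which model or prompt is used.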

Published In

WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN: 9781450394161
DOI: 10.1145/3543507
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. claim detection
  2. tweet classification
  3. types of claims

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

WWW '23: The ACM Web Conference 2023
April 30 - May 4, 2023
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
