short-paper

Open access

Identifying Checkworthy CURE Claims on Twitter

Authors:

Sujatha Das Gollapalli,

See-Kiong Ng

WWW '23: Proceedings of the ACM Web Conference 2023

Pages 4015 - 4019

https://doi.org/10.1145/3543507.3583870

Published: 30 April 2023

Abstract

Medical claims on social media, if left unchecked, have the potential to directly affect the well-being of consumers of online health information. However, existing studies on claim detection do not specifically focus on medical cure aspects, nor do they address whether a cure claim is “checkworthy”, an indicator of whether a claim is potentially beneficial or harmful if left unchecked. In this paper, we address these limitations by compiling CW-CURE, a novel dataset of CURE tweets, namely tweets containing claims on prevention, diagnoses, risks, treatments, and cures of medical conditions. CW-CURE contains tweets on four major health conditions, namely Alzheimer’s disease, Cancer, Diabetes, and Depression, annotated for claims, their “checkworthiness”, and the different types of claims, such as quantitative claim, correlation/causation, personal experience, and future prediction. We describe our processing pipeline for compiling CW-CURE and present classification results on CURE tweets using transformer-based models. In particular, we harness claim-type information obtained with zero-shot learning to show significant improvements in checkworthiness identification. Through CW-CURE, we hope to enable research on models for effective identification and flagging of impactful CURE content, to safeguard the public’s consumption of medical content online.

Cited By

Panchendrarajan R, Zubiaga A (2024). Claim detection for automated fact-checking: A survey on monolingual, multilingual and cross-lingual research. Natural Language Processing Journal 7 (100066). Online publication date: Jun-2024.
https://doi.org/10.1016/j.nlp.2024.100066

Index Terms

Identifying Checkworthy CURE Claims on Twitter
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Information systems
  1. World Wide Web
    1. Web mining

Information & Contributors

Information

Published In

WWW '23: Proceedings of the ACM Web Conference 2023

April 2023

4293 pages

ISBN:9781450394161

DOI:10.1145/3543507

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Conference

WWW '23

Sponsor:

SIGWEB

WWW '23: The ACM Web Conference 2023

April 30 - May 4, 2023

Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 1
Total Downloads: 557
Downloads (Last 12 months): 339
Downloads (Last 6 weeks): 37

Reflects downloads up to 01 Jan 2025
