research-article

Fake News Classification Based on Subjective Language

Authors:

Caio Libanio Melo Jeronimo,

Leandro Balby Marinho,

Claudio E. C. Campelo,

Adriano Veloso,

Allan Sales da Costa Melo

iiWAS2019: Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services

Pages 15 - 24

https://doi.org/10.1145/3366030.3366039

Published: 22 February 2020

Abstract

While many works investigate the spread patterns of fake news in social networks, we focus on the textual content. Instead of relying on syntactic representations of documents (i.e., Bag of Words), as many works do, we seek more robust representations that may better differentiate fake from legitimate news. We propose to consider the subjectivity of news, under the assumption that the subjectivity levels of legitimate and fake news are significantly different. To compute the subjectivity level of a news article, we rely on a set of subjectivity lexicons built by Brazilian linguists. We then build a subjectivity feature vector for each article by calculating the Word Mover's Distance (WMD) between the article and each of these lexicons, measured in the embedding space in which the article's words lie, and use these vectors to classify the documents. The results demonstrate that our method is more robust than classical text classification approaches, especially in scenarios where the training and test domains differ.
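The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes toy two-dimensional embeddings, two hypothetical lexicon categories, and it swaps the exact Word Mover's Distance for a relaxed lower bound in which every word moves all of its weight to the nearest word on the other side (the larger of the two one-sided costs is kept). All names and values below are invented for illustration.

```python
import math

# Toy 2-D word embeddings (hypothetical values; a real system would load
# pretrained word vectors instead).
EMBEDDINGS = {
    "shocking": (0.9, 0.1),
    "outrageous": (0.8, 0.2),
    "believe": (0.7, 0.4),
    "perhaps": (0.5, 0.5),
    "minister": (0.2, 0.8),
    "announced": (0.15, 0.85),
    "report": (0.1, 0.9),
}

# Hypothetical subjectivity lexicons, one word list per category.
LEXICONS = {
    "modalization": ["perhaps", "believe"],
    "sentiment": ["shocking", "outrageous"],
}


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided_cost(source, target):
    """Mean distance from each source word to its nearest target word."""
    total = 0.0
    for s in source:
        total += min(euclidean(EMBEDDINGS[s], EMBEDDINGS[t]) for t in target)
    return total / len(source)


def relaxed_wmd(doc, lexicon):
    """Relaxed lower bound of WMD with uniform weight per token.

    Out-of-vocabulary words are dropped; if either side ends up empty
    the distance is undefined and reported as infinity.
    """
    doc = [w for w in doc if w in EMBEDDINGS]
    lexicon = [w for w in lexicon if w in EMBEDDINGS]
    if not doc or not lexicon:
        return float("inf")
    return max(one_sided_cost(doc, lexicon), one_sided_cost(lexicon, doc))


def subjectivity_features(doc):
    """One distance per lexicon, in alphabetical order of lexicon name."""
    return [relaxed_wmd(doc, LEXICONS[name]) for name in sorted(LEXICONS)]


if __name__ == "__main__":
    article_a = ["shocking", "outrageous", "believe"]
    article_b = ["minister", "announced", "report"]
    print(subjectivity_features(article_a))
    print(subjectivity_features(article_b))
```

Each article is reduced to a short vector with one entry per lexicon, which could then be passed to any standard classifier. Because the vector depends on distances in the embedding space rather than on the article's exact vocabulary, it does not change size when the vocabulary changes, which is one plausible reading of why such features might transfer across domains.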

Index Terms

Fake News Classification Based on Subjective Language
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document analysis

Information & Contributors

Information

Published In

iiWAS2019: Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services

December 2019

709 pages

ISBN:9781450371797

DOI:10.1145/3366030

Copyright © 2019 ACM.

© 2019 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

JKU: Johannes Kepler Universität Linz
@WAS: International Organization of Information Integration and Web-based Applications and Services

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2020

Qualifiers

Research-article
Research
Refereed limited

Conference

iiWAS2019

iiWAS2019: The 21st International Conference on Information Integration and Web-based Applications & Services

December 2 - 4, 2019

Munich, Germany

Bibliometrics & Citations

Article Metrics

Total Citations: 17
Total Downloads: 542
Downloads (Last 12 months): 51
Downloads (Last 6 weeks): 3

Reflects downloads up to 09 Jan 2025

Citations

Cited By

Vasconcelos L, Campelo C (2024). Extracting Features from Text Flows based on Semantic Similarity for Text Classification: an Approach Inspired by Audio Analysis. Journal of the Brazilian Computer Society 30:1, 297-314. Online publication date: 25-Sep-2024. https://doi.org/10.5753/jbcs.2024.3759

Nugraha A, Pristyanto Y, Setiani R, Bahtiar S, Putra A, Aziza R (2024). A Comparative Study of Machine Learning Models for Detecting Fake News Content in Bahasa Indonesia Online Media. 2024 International Conference on Smart Computing, IoT and Machine Learning (SIML), 43-48. Online publication date: 6-Jun-2024. https://doi.org/10.1109/SIML61815.2024.10578272

Bharadwaj P, Kumar Y, Koul A (2024). Artificial Intelligence in Fake News Detection and Analysis for Low-Resource Languages. Congress on Smart Computing Technologies, 29-45. Online publication date: 30-Oct-2024. https://doi.org/10.1007/978-981-97-5081-8_3

Barrón-Cedeño A, Alam F, Struß J, Nakov P, Chakraborty T, Elsayed T, Przybyła P, Caselli T, Da San Martino G, Haouari F, Hasanain M, Li C, Piskorski J, Ruggeri F, Song X, Suwaileh R (2024). Overview of the CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness. Experimental IR Meets Multilinguality, Multimodality, and Interaction, 28-52. Online publication date: 19-Sep-2024. https://doi.org/10.1007/978-3-031-71908-0_2

Barrón-Cedeño A, Alam F, Chakraborty T, Elsayed T, Nakov P, Przybyła P, Struß J, Haouari F, Hasanain M, Ruggeri F, Song X, Suwaileh R (2024). The CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness. Advances in Information Retrieval, 449-458. Online publication date: 23-Mar-2024. https://doi.org/10.1007/978-3-031-56069-9_62

Lim G, Perrault S (2023). XAI in Automated Fact-Checking? The Benefits Are Modest and There's No One-Explanation-Fits-All. Proceedings of the 35th Australian Computer-Human Interaction Conference, 624-638. Online publication date: 2-Dec-2023. https://dl.acm.org/doi/10.1145/3638380.3638388

Moschopoulos V, Tsourma M, Drosou A, Tzovaras D (2023). Misinformation detection based on news dispersion. 2023 24th International Conference on Digital Signal Processing (DSP), 1-5. Online publication date: 11-Jun-2023. https://doi.org/10.1109/DSP58604.2023.10167997

Himdi H, Assiri F (2023). Tasaheel: An Arabic Automative Textual Analysis Tool—All in One. IEEE Access 11, 139979-139992. Online publication date: 2023. https://doi.org/10.1109/ACCESS.2023.3340520

Barrón-Cedeño A, Alam F, Caselli T, Da San Martino G, Elsayed T, Galassi A, Haouari F, Ruggeri F, Struß J, Nandi R, Cheema G, Azizov D, Nakov P (2023). The CLEF-2023 CheckThat! Lab: Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority. Advances in Information Retrieval, 506-517. Online publication date: 16-Mar-2023. https://doi.org/10.1007/978-3-031-28241-6_59

Wylde V, Prakash E, Hewage C, Platts J (2023). Ethical Challenges in the Use of Digital Technologies: AI and Big Data. Digital Transformation in Policing: The Promise, Perils and Solutions, 33-58. Online publication date: 3-Jan-2023. https://doi.org/10.1007/978-3-031-09691-4_3
