poster

Extremist Propaganda Tweet Classification with Deep Learning in Realistic Scenarios

Authors:

Leonardo Nizzoli,

Marco Avvenuti,

Stefano Cresci,

Maurizio Tesconi

WebSci '19: Proceedings of the 10th ACM Conference on Web Science

Pages 203 - 204

https://doi.org/10.1145/3292522.3326050

Published: 26 June 2019

Abstract

In this work, we tackled the problem of automatically classifying extremist propaganda on Twitter, focusing on the Islamic State of Iraq and al-Sham (ISIS). We built and published several datasets by mixing 15,684 ISIS propaganda tweets with a variable number of neutral tweets, both ISIS-related and random, reaching class imbalances as severe as 1%. We considered three state-of-the-art deep learning techniques, representative of the main current approaches to text classification, and two strong linear machine learning baselines. We compared their performance while varying the composition of the training and test sets, both to explore different training strategies and to evaluate the results under increasingly realistic conditions. We demonstrated that a Recurrent-Convolutional Neural Network based on pre-trained word embeddings can reach an F1 score of 0.9 in the most challenging test condition (1% imbalance).
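
As a rough illustration of the dataset mixing described in the abstract, the following Python sketch computes how many non-propaganda tweets would have to be mixed with the 15,684 propaganda tweets to reach a given class ratio, and shows why F1 is more informative than accuracy at that ratio. It assumes that "1% imbalance" means propaganda tweets make up 1% of the mixed dataset. The function names and the "always neutral" classifier are hypothetical and are not taken from the paper.

```python
def negatives_needed(n_positive: int, positive_fraction: float) -> int:
    """Number of negative samples to add so that positives make up
    `positive_fraction` of the mixed dataset (assumed meaning of "imbalance")."""
    if not 0.0 < positive_fraction <= 1.0:
        raise ValueError("positive_fraction must be in (0, 1]")
    total = round(n_positive / positive_fraction)
    return total - n_positive


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


if __name__ == "__main__":
    n_pos = 15684  # propaganda tweets, as stated in the abstract
    n_neg = negatives_needed(n_pos, 0.01)
    print(f"negatives needed for a 1% positive class: {n_neg}")

    # Hypothetical classifier that labels every tweet as neutral:
    # it finds no propaganda at all, yet its accuracy looks very high.
    tp, fp, fn, tn = 0, 0, n_pos, n_neg
    print(f"accuracy: {accuracy(tp, fp, fn, tn):.2f}")
    print(f"F1:       {f1_score(tp, fp, fn):.2f}")
```

Under these assumptions, a classifier that never flags propaganda is correct on almost every tweet while its F1 is zero, which is why F1 on the minority class is the figure worth reporting in the most imbalanced condition.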

Index Terms

Extremist Propaganda Tweet Classification with Deep Learning in Realistic Scenarios
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published In

WebSci '19: Proceedings of the 10th ACM Conference on Web Science

June 2019

395 pages

ISBN:9781450362023

DOI:10.1145/3292522

General Chairs:
Paolo Boldi, Università degli Studi, Milano, Italy
Brooke Foucault Welles, Northeastern University, Boston, USA
Katharina Kinder-Kurlanda, GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
Christo Wilson, Northeastern University, Boston, USA

Program Chairs:
Isabella Peters, ZBW Leibniz Information Center for Economics & Kiel University, Kiel, Germany
Wagner Meira, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

WebSci '19

Sponsor:

SIGWEB

WebSci '19: 11th ACM Conference on Web Science

June 30 - July 3, 2019

Boston, Massachusetts, USA

Acceptance Rates

WebSci '19 Paper Acceptance Rate 41 of 130 submissions, 32%;

Overall Acceptance Rate 245 of 933 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Total Citations: 16
Total Downloads: 272
Downloads (Last 12 months): 14
Downloads (Last 6 weeks): 0

Reflects downloads up to 23 Dec 2024

Citations

Cited By

Zerrouki K, Benblidia N, Boussaid O (2024). Preprocessing multilingual text for the detection of extremism and radicalization in social networks using deep learning. Studies in Engineering and Exact Sciences 5:2 (e11286). Online publication date: 29-Nov-2024.
https://doi.org/10.54021/seesv5n2-594
Sen A, De S, Pal J (2024). Networks and Influencers in Online Propaganda Events: A Comparative Study of Three Cases in India. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-27). Online publication date: 26-Apr-2024.
https://dl.acm.org/doi/10.1145/3653709
Chaudhari D, Pawar A (2023). Empowering Propaganda Detection in Resource-Restraint Languages: A Transformer-Based Framework for Classifying Hindi News Articles. Big Data and Cognitive Computing 7:4 (175). Online publication date: 15-Nov-2023.
https://doi.org/10.3390/bdcc7040175
Tundis A, Shams A, Mühlhäuser M (2023). From the detection towards a pyramidal classification of terrorist propaganda. Journal of Information Security and Applications 79:C. Online publication date: 1-Dec-2023.
https://dl.acm.org/doi/10.1016/j.jisa.2023.103646
Madichetty S, M S, Madisetty S (2023). A RoBERTa based model for identifying the multi-modal informative tweets during disaster. Multimedia Tools and Applications 82:24 (37615-37633). Online publication date: 29-Mar-2023.
https://doi.org/10.1007/s11042-023-14780-9
Chaudhari D, Pawar A (2022). A Systematic Comparison of Machine Learning and NLP Techniques to Unveil Propaganda in Social Media. Journal of Information Technology Research 15:1 (1-14). Online publication date: 1-Jan-2022.
https://doi.org/10.4018/JITR.299384
Chaudhari D, Pawar A, Barrón-Cedeño A (2022). H-Prop and H-Prop-News: Computational Propaganda Datasets in Hindi. Data 7:3 (29). Online publication date: 28-Feb-2022.
https://doi.org/10.3390/data7030029
Sofat C, Gill S, Bansal D (2022). Full/Regular Research Paper submission to (CSCI-RTCW): Multi Class Classification of Online Radicalization Using Transformer Models. 2022 International Conference on Computational Science and Computational Intelligence (CSCI) (1034-1038). Online publication date: Dec-2022.
https://doi.org/10.1109/CSCI58124.2022.00183
Gaikwad M, Ahirrao S, Kotecha K, Abraham A (2022). Multi-Ideology Multi-Class Extremism Classification Using Deep Learning Techniques. IEEE Access 10 (104829-104843). Online publication date: 2022.
https://doi.org/10.1109/ACCESS.2022.3205744
Trabelsi Z, Saidi F, Thangaraj E, Veni T (2022). A survey of extremism online content analysis and prediction techniques in twitter based on sentiment analysis. Security Journal 36:2 (221-248). Online publication date: 18-Apr-2022.
https://doi.org/10.1057/s41284-022-00335-4
