Abstract
Social networks have become powerful communication media that provide information, learning, and entertainment. Unfortunately, these platforms can be manipulated by malicious users who share harmful content. Mining and analyzing such suspicious content is therefore a challenging task, and one that helps fight online radicalization. In this paper, we propose a new methodology for extracting and analyzing violent vocabulary shared on social networks, using a set of natural language processing and data mining techniques. Our method starts from a set of user profiles that a domain expert has judged as extremist or non-extremist. We then focus on the textual content these users share in order to detect malicious vocabulary published in the radical context, along with its degree of violence. Finally, to evaluate the method, an expert verifies the final list of extracted vocabulary annotated by our method. The results show that the method is both effective and efficient.
Acknowledgements
This publication was made possible by NPRP grant #9-175-1-033 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Rekik, A., Jamoussi, S., Hamadou, A.B. (2019). Violent Vocabulary Extraction Methodology: Application to the Radicalism Detection on Social Media. In: Nguyen, N., Chbeir, R., Exposito, E., Aniorté, P., Trawiński, B. (eds) Computational Collective Intelligence. ICCCI 2019. Lecture Notes in Computer Science, vol 11684. Springer, Cham. https://doi.org/10.1007/978-3-030-28374-2_9
DOI: https://doi.org/10.1007/978-3-030-28374-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28373-5
Online ISBN: 978-3-030-28374-2
eBook Packages: Computer Science, Computer Science (R0)