Abstract
Harmful and toxic speech contributes to an unwelcoming online environment that suppresses participation and conversation. Efforts have focused on detecting and mitigating harmful speech; however, the mechanisms by which toxicity degrades online discussions are not well understood. This paper makes two contributions. First, to comprehensively model harmful comments, we introduce a multilingual misogyny and sexist speech detection model (https://huggingface.co/annahaz/xlm-roberta-base-misogyny-sexism-indomain-mix-bal). Second, we model the complex dynamics of online discussions as feedback loops in which harmful comments lead to negative emotions, which in turn prompt even more harmful comments. To quantify these feedback loops, we use a combination of mutual Granger causality and regression to analyze discussions on two political forums on Reddit: the moderated political forum r/Politics and the moderated neutral political forum r/NeutralPolitics. Our results suggest that harmful comments and negative emotions create self-reinforcing feedback loops in forums. In contrast, moderation combined with neutral discussion appears to tip interactions into self-extinguishing feedback loops that reduce harmful speech and negative emotions. Our study sheds light on the complex dynamics of harmful speech and on the role of moderation and neutral discussion in mitigating them.
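To make the two components above concrete, the following is a minimal sketch, not the authors' released pipeline: it loads the detection model named in the abstract from the Hugging Face Hub via the transformers library, scores comments, and runs statsmodels' Granger-causality test in both directions on a pair of time series. The positive-label index, the 4-lag maximum, and the synthetic stand-in series are assumptions for illustration only.

```python
# Minimal sketch of the two components named in the abstract; not the
# authors' actual pipeline. Label order, lag choice, and the toy series
# below are assumptions for illustration.
import numpy as np
import pandas as pd
import torch
from statsmodels.tsa.stattools import grangercausalitytests
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "annahaz/xlm-roberta-base-misogyny-sexism-indomain-mix-bal"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def misogyny_scores(texts):
    """Per-comment probability of the misogynistic/sexist class."""
    enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    # Assumes index 1 is the positive (misogynistic/sexist) label.
    return torch.softmax(logits, dim=-1)[:, 1].tolist()

print(misogyny_scores(["You argue this point well.",
                       "Women don't belong in politics."]))

# Toy series standing in for per-window averages of comment harmfulness
# and negative emotion in a forum; real inputs would come from scored
# comments bucketed by time.
rng = np.random.default_rng(0)
neg_emotion = rng.random(200)
harm = 0.6 * np.roll(neg_emotion, 1) + 0.4 * rng.random(200)  # toy coupling
series = pd.DataFrame({"harm": harm, "neg_emotion": neg_emotion})

# Mutual Granger causality: each call tests whether the second column helps
# predict the first; testing both orderings probes the feedback loop in
# both directions.
grangercausalitytests(series[["harm", "neg_emotion"]], maxlag=4)
grangercausalitytests(series[["neg_emotion", "harm"]], maxlag=4)
```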
Acknowledgments
This material is based upon work supported in part by the Defense Advanced Research Projects Agency (DARPA) under Agreements No. HR00112290025 and HR001121C0168, and in part by the Air Force Office of Scientific Research (AFOSR) under contract FA9550-20-1-0224. Approved for public release; distribution is unlimited.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chang, R.C., May, J., Lerman, K. (2023). Feedback Loops and Complex Dynamics of Harmful Speech in Online Discussions. In: Thomson, R., Al-khateeb, S., Burger, A., Park, P., Pyke, A.A. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2023. Lecture Notes in Computer Science, vol 14161. Springer, Cham. https://doi.org/10.1007/978-3-031-43129-6_9
DOI: https://doi.org/10.1007/978-3-031-43129-6_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43128-9
Online ISBN: 978-3-031-43129-6
eBook Packages: Computer Science, Computer Science (R0)