Categorizing Sexism and Misogyny through Neural Approaches

Published: 14 June 2021

Abstract

Sexism, an injustice that subjects women and girls to enormous suffering, manifests in blatant as well as subtle ways. In the wake of growing documentation of experiences of sexism on the web, the automatic categorization of accounts of sexism has the potential to assist social scientists and policymakers in studying and thereby countering sexism. The existing work on sexism classification has certain limitations in terms of the categories of sexism used and/or whether they can co-occur. To the best of our knowledge, this is the first work on the multi-label classification of sexism of any kind(s). We also consider the related task of misogyny classification. While sexism classification is performed on textual accounts describing sexism suffered or observed, misogyny classification is carried out on tweets perpetrating misogyny. We devise a novel neural framework for classifying sexism and misogyny that can combine text representations obtained using models such as Bidirectional Encoder Representations from Transformers (BERT) with distributional and linguistic word embeddings using a flexible architecture involving recurrent components and optional convolutional ones. Further, we leverage unlabeled accounts of sexism to infuse domain-specific elements into our framework. To evaluate the versatility of our neural approach for tasks pertaining to sexism and misogyny, we experiment with adapting it for misogyny identification. For categorizing sexism, we investigate multiple loss functions and problem transformation techniques to address the multi-label problem formulation. We develop an ensemble approach using a proposed multi-label classification model with potentially overlapping subsets of the category set. The proposed methods outperform several deep-learning as well as traditional machine learning baselines for all three tasks.
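
The abstract mentions problem transformation techniques and loss functions for the multi-label setting without naming specific ones. As a rough illustration only, the sketch below shows two standard transformations (binary relevance and label powerset) and a per-category binary cross-entropy loss. The category names, function names, and the choice of these particular techniques are assumptions made for this example and are not taken from the article.

```python
import math
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical category names, used only for illustration.
CATEGORIES = ["role_stereotyping", "body_shaming", "moral_policing"]


def binary_relevance(labels: Sequence[FrozenSet[str]]) -> Dict[str, List[int]]:
    """Turn a multi-label problem into one binary target vector per category."""
    return {c: [int(c in y) for y in labels] for c in CATEGORIES}


def label_powerset(
    labels: Sequence[FrozenSet[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Map each distinct label combination to a single multi-class id."""
    class_ids: Dict[FrozenSet[str], int] = {}
    targets = [class_ids.setdefault(y, len(class_ids)) for y in labels]
    return targets, class_ids


def multilabel_bce(probs: Sequence[float], targets: Sequence[int]) -> float:
    """Mean binary cross-entropy over independent per-category probabilities."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, t in zip(probs, targets):
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(probs)


# Three toy accounts; the first and third carry two co-occurring categories.
accounts = [
    frozenset({"role_stereotyping", "body_shaming"}),
    frozenset({"moral_policing"}),
    frozenset({"role_stereotyping", "body_shaming"}),
]
per_category_targets = binary_relevance(accounts)
powerset_targets, powerset_classes = label_powerset(accounts)
```

Binary relevance keeps categories independent and lets them co-occur freely, whereas label powerset treats each observed combination as its own class; which of these, if either, matches the article's experiments is not stated in the abstract.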

Information & Contributors

Information

Published In

ACM Transactions on the Web, Volume 15, Issue 4
November 2021
152 pages
ISSN: 1559-1131
EISSN: 1559-114X
DOI: 10.1145/3465465
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2021
Accepted: 01 March 2021
Revised: 01 December 2020
Received: 01 May 2020
Published in TWEB Volume 15, Issue 4

Author Tags

  1. Sexism classification
  2. neural networks
  3. multi-label classification
  4. machine learning
  5. text classification
  6. misogyny detection
  7. misogyny classification

Qualifiers

  • Research-article
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 209
  • Downloads (Last 6 weeks): 36
Reflects downloads up to 11 Dec 2024

Citations

Cited By

  • (2025) A context-aware attention and graph neural network-based multimodal framework for misogyny detection. Information Processing & Management 62:1 (103895). DOI: 10.1016/j.ipm.2024.103895. Online publication date: Jan-2025
  • (2024) A Systematic Literature Review on Automatic Sexism Detection in Social Media. Engineering, Technology & Applied Science Research 14:6 (18178-18188). DOI: 10.48084/etasr.8881. Online publication date: 2-Dec-2024
  • (2024) Sexism and misogyny as traits of police culture: Problems, red flags and solutions. International Journal of Police Science & Management 26:2 (279-291). DOI: 10.1177/14613557241228736. Online publication date: 7-Feb-2024
  • (2024) Exploring ChatGPT for identifying sexism in the communication of software developers. Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments (400-403). DOI: 10.1145/3652037.3663918. Online publication date: 26-Jun-2024
  • (2024) Leveraging Domain-Specific Word Embedding and Hate Concepts in Hate Speech Detection. 2024 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN60899.2024.10650500. Online publication date: 30-Jun-2024
  • (2024) Multi-task learning neural framework for categorizing sexism. Computer Speech and Language 83:C. DOI: 10.1016/j.csl.2023.101535. Online publication date: 1-Jan-2024
  • (2024) Large Language Model Cascades and Persona-Based In-Context Learning for Multilingual Sexism Detection. Experimental IR Meets Multilinguality, Multimodality, and Interaction (254-265). DOI: 10.1007/978-3-031-71736-9_18. Online publication date: 9-Sep-2024
  • (2023) Cyberbullying-related Hate Speech Detection Using Shallow-to-deep Learning. Computers, Materials & Continua 74:1 (2115-2131). DOI: 10.32604/cmc.2023.032993. Online publication date: 2023
  • (2023) Fem-Scale: A Data-Driven Approach for Quantifying Degree of Individual Feminism Perspective. 2023 IEEE International Conference on Contemporary Computing and Communications (InC4) (1-6). DOI: 10.1109/InC457730.2023.10263255. Online publication date: 21-Apr-2023
  • (2023) Identification of Misogyny on Social Media in Indonesian Using Bidirectional Encoder Representations From Transformers (BERT). 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) (401-406). DOI: 10.1109/ICAIIC57133.2023.10067106. Online publication date: 20-Feb-2023
