DOI: 10.1145/3485447.3512260
short-paper

On Explaining Multimodal Hateful Meme Detection Models

Published: 25 April 2022

Abstract

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to this multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear whether these models can capture the derogatory references or slurs expressed across the modalities (i.e., image and text) of hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of visual-linguistic models performing the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and that the visual-linguistic models are able to perform visual-text slur grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which result in false-positive predictions.

Information

Published In

WWW '22: Proceedings of the ACM Web Conference 2022
April 2022
3764 pages
ISBN: 9781450390965
DOI: 10.1145/3485447

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. explainable machine learning
2. hate speech
3. hateful memes
4. multimodal

Qualifiers

• Short-paper
• Research
• Refereed limited

Conference

WWW '22: The ACM Web Conference 2022
April 25 - 29, 2022
Virtual Event, Lyon, France

Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
