
An emoji-aware multitask framework for multimodal sarcasm detection

Published: 05 December 2022

Abstract

Sarcasm is a form of implicit emotion, and detecting it often requires additional information such as context and multiple modalities. Sometimes, however, even this additional information fails to reveal the sarcasm. For example, the utterance “Oh yes, you’ve been so helpful. Thank you so much for all your help”, said in a polite tone with a smiling face, is easily read as non-sarcastic because of its positive sentiment. But if the same message is accompanied by a frustrated emoji, the emoji’s negative sentiment becomes evident and the intended sarcasm is easily understood. In this paper, we therefore propose SEEmoji MUStARD, an extension of the multimodal MUStARD dataset in which each utterance is annotated with a relevant emoji, the emoji’s sentiment, and the emoji’s emotion. We further propose an emoji-aware multimodal multitask deep learning framework for sarcasm detection (the primary task) and sentiment and emotion detection (the secondary tasks) in a multimodal conversational setting. Experimental results on SEEmoji MUStARD show the efficacy of the proposed emoji-aware multimodal approach for sarcasm detection over existing models.
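At a high level, the framework treats sarcasm detection as the primary task and jointly learns sentiment and emotion detection as secondary tasks, with the emoji annotation contributing an additional input signal. The sketch below is a minimal illustration of such an emoji-aware multitask setup, not the authors' architecture: the modality dimensions, the simple concatenation fusion, the class counts, and the loss weights are all illustrative assumptions.

# Minimal sketch (assumptions noted above): a shared fusion layer over per-modality
# utterance features plus an emoji embedding, with a primary sarcasm head and
# secondary sentiment/emotion heads trained jointly.
import torch
import torch.nn as nn

class EmojiAwareMultitask(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, video_dim=512,
                 emoji_dim=300, hidden_dim=256, n_sentiment=3, n_emotion=9):
        super().__init__()
        fused_dim = text_dim + audio_dim + video_dim + emoji_dim
        # Shared representation learned from all modalities and the emoji embedding.
        self.shared = nn.Sequential(
            nn.Linear(fused_dim, hidden_dim), nn.ReLU(), nn.Dropout(0.3))
        self.sarcasm_head = nn.Linear(hidden_dim, 2)               # primary task
        self.sentiment_head = nn.Linear(hidden_dim, n_sentiment)   # secondary task
        self.emotion_head = nn.Linear(hidden_dim, n_emotion)       # secondary task

    def forward(self, text, audio, video, emoji):
        h = self.shared(torch.cat([text, audio, video, emoji], dim=-1))
        return self.sarcasm_head(h), self.sentiment_head(h), self.emotion_head(h)

def multitask_loss(outputs, labels, w_sent=0.5, w_emo=0.5):
    # Joint objective: cross-entropy on each task, with illustrative weights that
    # keep the primary sarcasm loss dominant over the secondary tasks.
    sarcasm_logits, sent_logits, emo_logits = outputs
    sarcasm_y, sent_y, emo_y = labels
    ce = nn.CrossEntropyLoss()
    return (ce(sarcasm_logits, sarcasm_y)
            + w_sent * ce(sent_logits, sent_y)
            + w_emo * ce(emo_logits, emo_y))

In such a setup, down-weighting the secondary heads lets the shared layer benefit from sentiment and emotion supervision without letting those tasks dominate sarcasm detection.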




Information

Published In

Knowledge-Based Systems, Volume 257, Issue C
Dec 2022
777 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 05 December 2022

Author Tags

  1. Multimodal sarcasm
  2. Multimodal sentiment
  3. Multimodal emotion
  4. Emoji
  5. MUStARD dataset
  6. Deep learning

Qualifiers

  • Research-article

Cited By

  • (2024) MV-BART: Multi-view BART for Multi-modal Sarcasm Detection. In: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 3602–3611. https://doi.org/10.1145/3627673.3679570 (online publication date: 21-Oct-2024)
  • (2024) An attention-based, context-aware multimodal fusion method for sarcasm detection using inter-modality inconsistency. Knowledge-Based Systems, Volume 287, Issue C. https://doi.org/10.1016/j.knosys.2024.111457 (online publication date: 5-Mar-2024)
  • (2024) Image-Text Sarcasm Detection for Enhanced Understanding. In: Pattern Recognition, pp. 1–14. https://doi.org/10.1007/978-3-031-78186-5_1 (online publication date: 1-Dec-2024)
  • (2023) A Multi-task Model for Emotion and Offensive Aided Stance Detection of Climate Change Tweets. In: Proceedings of the ACM Web Conference 2023, pp. 3948–3958. https://doi.org/10.1145/3543507.3583860 (online publication date: 30-Apr-2023)
  • (2023) Multimodal hate speech detection via multi-scale visual kernels and knowledge distillation architecture. Engineering Applications of Artificial Intelligence, Volume 126, Part B. https://doi.org/10.1016/j.engappai.2023.106991 (online publication date: 1-Nov-2023)
