Sarcasm Detection in Tamil Code-Mixed Data Using Transformers

Rajalakshmi Ratnavel¹²,
R. Gabriel Joshua¹²,
S. R. Varsini¹² &
M. Anand Kumar¹³

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2046)

Included in the following conference series:

International Conference on Speech and Language Technologies for Low-resource Languages

Abstract

Social media analytics has grown steadily in popularity because of the large volume of customer data it offers, benefiting businesses of all sizes, from local ventures to global brands. Analysing textual content aids the understanding of context and enables content moderation to maintain a positive user experience. Sarcasm detection in social media is essential for constructive and respectful online communication: it helps prevent misunderstandings, minimise conflicts, and foster a positive and inclusive digital environment. We propose a Transformer-based model for sarcasm detection in Tamil code-mixed text. The model consists of two custom-designed layers: an Embedding layer and an Encoder layer. The Encoder incorporates a multi-head self-attention layer and feed-forward neural networks, followed by normalisation and dropout layers. The proposed model outperforms other state-of-the-art models for sarcasm detection, achieving a weighted \(F_1\) score of 0.77, and effectively addresses the challenges specific to Tamil code-mixed text.
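The architecture named in the abstract (an Embedding layer feeding an Encoder built from multi-head self-attention, a feed-forward network, normalisation and dropout) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the preview does not give the layer sizes, the ordering of dropout and normalisation, the pooling step, or the output layer, so every constant and function name below is a hypothetical choice made for illustration. The weights are random and untrained, and padding masks are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not taken from the paper).
VOCAB, MAX_LEN, D_MODEL, N_HEADS, D_FF, N_CLASSES = 1000, 32, 64, 4, 128, 2


def init(*shape):
    """Small random weights; a real model would learn these during training."""
    return rng.normal(0.0, 0.02, size=shape)


params = {
    "tok_emb": init(VOCAB, D_MODEL), "pos_emb": init(MAX_LEN, D_MODEL),
    "wq": init(D_MODEL, D_MODEL), "wk": init(D_MODEL, D_MODEL),
    "wv": init(D_MODEL, D_MODEL), "wo": init(D_MODEL, D_MODEL),
    "w1": init(D_MODEL, D_FF), "b1": np.zeros(D_FF),
    "w2": init(D_FF, D_MODEL), "b2": np.zeros(D_MODEL),
    "w_out": init(D_MODEL, N_CLASSES), "b_out": np.zeros(N_CLASSES),
}


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-6):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def dropout(x, rate, training):
    """Inverted dropout; acts as the identity at inference time."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)


def embed(token_ids, p):
    """Embedding layer: token embedding plus learned position embedding."""
    seq_len = token_ids.shape[1]
    return p["tok_emb"][token_ids] + p["pos_emb"][:seq_len]


def multi_head_self_attention(x, p):
    batch, seq_len, d_model = x.shape
    d_head = d_model // N_HEADS

    def split(t):  # (batch, seq, d_model) -> (batch, heads, seq, d_head)
        return t.reshape(batch, seq_len, N_HEADS, d_head).transpose(0, 2, 1, 3)

    q, k, v = split(x @ p["wq"]), split(x @ p["wk"]), split(x @ p["wv"])
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(d_head)
    heads = softmax(scores, axis=-1) @ v
    merged = heads.transpose(0, 2, 1, 3).reshape(batch, seq_len, d_model)
    return merged @ p["wo"]


def encoder(x, p, rate=0.1, training=False):
    """Encoder layer: attention and feed-forward sub-layers, each followed by
    dropout, a residual connection and layer normalisation (assumed ordering)."""
    attn = dropout(multi_head_self_attention(x, p), rate, training)
    x = layer_norm(x + attn)
    ff = np.maximum(0.0, x @ p["w1"] + p["b1"]) @ p["w2"] + p["b2"]
    return layer_norm(x + dropout(ff, rate, training))


def classify(token_ids, p, training=False):
    """Mean-pool the encoder output and map it to class probabilities."""
    h = encoder(embed(token_ids, p), p, training=training)
    return softmax(h.mean(axis=1) @ p["w_out"] + p["b_out"])


# Three hypothetical comments of ten token ids each.
batch = rng.integers(0, VOCAB, size=(3, 10))
probs = classify(batch, params)
print(probs.shape)
```

Each row of `probs` holds the probabilities for one comment over the two assumed classes (sarcastic, non-sarcastic). With trained weights and a tokenizer suited to code-mixed Tamil, predictions of this form would be the input to the weighted \(F_1\) score reported in the abstract.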

Author information

Authors and Affiliations

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
Rajalakshmi Ratnavel, R. Gabriel Joshua & S. R. Varsini
Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India
M. Anand Kumar

Corresponding author

Correspondence to Rajalakshmi Ratnavel.

Editor information

Editors and Affiliations

National University of Ireland, Galway, Ireland
Bharathi Raja Chakravarthi
Sri Sivasubramaniya Nadar College of Engineering, Kalavakkam, Tamil Nadu, India
Bharathi B
University of Jaén, Jaén, Jaén, Spain
Miguel Ángel García Cumbreras
University of Jaén, Jaén, Jaén, Spain
Salud María Jiménez Zafra
Kongu Engineering College, Erode, Tamil Nadu, India
Malliga Subramanian
Kongu Engineering College, Erode, Tamil Nadu, India
Kogilavani Shanmugavadivel
Mohamed Bin Zayed University of Artificial Intelligence, Abu Dhabi, Abu Dhabi, United Arab Emirates
Preslav Nakov

About this paper

Cite this paper

Ratnavel, R., Joshua, R.G., Varsini, S.R., Kumar, M.A. (2024). Sarcasm Detection in Tamil Code-Mixed Data Using Transformers. In: Chakravarthi, B.R., et al. Speech and Language Technologies for Low-Resource Languages. SPELLL 2023. Communications in Computer and Information Science, vol 2046. Springer, Cham. https://doi.org/10.1007/978-3-031-58495-4_32

DOI: https://doi.org/10.1007/978-3-031-58495-4_32
Published: 24 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58494-7
Online ISBN: 978-3-031-58495-4
eBook Packages: Computer Science, Computer Science (R0)
