Abstract
Social media analytics is gaining popularity because of the large volume of customer data it offers, which benefits businesses of all sizes, from local ventures to global brands. Analysing textual content aids context understanding and enables content moderation to maintain a positive user experience. Sarcasm detection in social media helps maintain constructive and respectful online communication by preventing misunderstandings, minimising conflicts, and fostering a positive and inclusive digital environment. We propose a Transformer-based model for sarcasm detection in Tamil code-mixed text. The model consists of two custom-designed layers: an Encoder layer and an Embedding layer. It incorporates a multi-head self-attention layer and feed-forward neural networks, followed by normalisation and dropout layers. The proposed model outperforms other state-of-the-art models for sarcasm detection, achieving a weighted \(F_1\) score of 0.77. It effectively addresses the unique challenges posed by Tamil code-mixed text.
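For readers unfamiliar with the metric, the weighted \(F_1\) score reported above is the mean of the per-class \(F_1\) scores, each weighted by the number of true instances of that class. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `weighted_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true instances of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its support
    return score


# Hypothetical toy labels, not taken from the shared-task dataset.
y_true = ["sarcastic", "non-sarcastic", "non-sarcastic",
          "sarcastic", "non-sarcastic", "non-sarcastic"]
y_pred = ["sarcastic", "non-sarcastic", "sarcastic",
          "non-sarcastic", "non-sarcastic", "non-sarcastic"]

print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class contributes in proportion to its support, the majority class dominates the score on imbalanced data, which is worth keeping in mind when comparing weighted and macro-averaged results.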
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ratnavel, R., Joshua, R.G., Varsini, S.R., Kumar, M.A. (2024). Sarcasm Detection in Tamil Code-Mixed Data Using Transformers. In: Chakravarthi, B.R., et al. Speech and Language Technologies for Low-Resource Languages. SPELLL 2023. Communications in Computer and Information Science, vol 2046. Springer, Cham. https://doi.org/10.1007/978-3-031-58495-4_32
DOI: https://doi.org/10.1007/978-3-031-58495-4_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-58494-7
Online ISBN: 978-3-031-58495-4
eBook Packages: Computer Science, Computer Science (R0)