Abstract
The Covid-19 pandemic has driven a marked increase in multilingual social media communication, characterized by dialectal Arabic mixed with code-switched text in Latin script, primarily French and English. This phenomenon, which became prominent as the pandemic spread, poses difficult challenges for automated sentiment analysis because suitable training resources for such linguistically diverse content are scarce. Our research systematically examines this informal, multilingual text. Unlike previous approaches, which relied predominantly on deep learning techniques supplemented with hand-crafted features, our study uses no preprocessing steps. We evaluate a range of unsupervised word representation models, including word2vec, BERT, and M-BERT, and explore deep learning classifiers, in particular Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) networks, for processing complex multilingual data without manually engineered features. The findings reported in this paper come from experiments on the publicly available CTSA and TUNIZI datasets, as well as on the combined CTSA-TUNIZI dataset. The results underscore the effectiveness of these methods for analyzing multilingual, informal text, with M-BERT proving particularly effective at handling the varied and fast-changing language of multilingual social media discourse. This capability is valuable during global crises, as it provides insight into public responses and supports informed pandemic management decisions.
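As a rough illustration of what learning word representations from raw, unpreprocessed text can involve, the sketch below splits a code-switched comment on whitespace only and builds the (center, context) pairs on which a skip-gram word2vec model would be trained. The skip-gram variant, the window size, the function name `skipgram_pairs`, and the example comment are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for tokens at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical code-switched comment (illustrative only, not from CTSA or TUNIZI).
comment = "rabi yostor el covid est dangereux stay home"

# No normalization, stemming, or transliteration: split on whitespace only.
tokens = comment.split()
pairs = skipgram_pairs(tokens, window=1)
```

Because the tokens are left untouched, Tunisian, French, and English words share one vocabulary and appear in each other's context windows, which is what lets a single embedding space cover the mixed-language text.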
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jaballi, S., Zrigui, S., Hazar, M.J., Nicolas, H., Zrigui, M. (2024). Exploring Unsupervised Word Representations Models and Neural Networks for Informal Multilingual Text Against Covid-19 Social Media Content. In: Nguyen, N.T., et al. Intelligent Information and Database Systems. ACIIDS 2024. Lecture Notes in Computer Science, vol 14796. Springer, Singapore. https://doi.org/10.1007/978-981-97-4985-0_27
DOI: https://doi.org/10.1007/978-981-97-4985-0_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4984-3
Online ISBN: 978-981-97-4985-0
eBook Packages: Computer Science, Computer Science (R0)