Exploring Unsupervised Word Representations Models and Neural Networks for Informal Multilingual Text Against Covid-19 Social Media Content

Samawewl Jaballi (ORCID: orcid.org/0000-0002-6096-6048)^14,15,16,
Salah Zrigui^17,
Manar Joundy Hazar^14,15,18,
Henri Nicolas^16 &
Mounir Zrigui^14,15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14796)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

The Covid-19 pandemic prompted a marked upsurge in multilingual social media communication, characterized by the fusion of dialectal Arabic with code-switched Latin-script text, primarily French and English. This phenomenon, which became prominent during the spread of the pandemic, poses difficult challenges for automated sentiment analysis because suitable training resources for such linguistically diverse content are scarce. Our research systematically examines this informal, multilingual text. Unlike previous approaches, which relied predominantly on deep learning techniques supplemented with hand-crafted features, our study applies no preprocessing steps. We evaluate a range of unsupervised word representation models, including word2vec, BERT, and M-BERT, and explore deep learning classifiers, in particular Convolutional Neural Networks (CNNs) and Bidirectional Long Short-Term Memory (BiLSTM) networks, for processing complex multilingual data without manually engineered features. The findings reported in this paper come from experiments on the publicly available CTSA and TUNIZI datasets, as well as on the combined CTSA-TUNIZI dataset. The results underscore the effectiveness of these methods for analyzing multilingual, informal text, with M-BERT proving particularly proficient at handling the intricate and fast-changing language of multilingual social media discourse. This capability is valuable during global crises: it offers insight into public responses and can support informed pandemic management decisions.
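
As a rough illustration of what working on raw text "without any preprocessing steps" can look like for a word2vec-style model, the sketch below builds the (center, context) pairs that the skip-gram variant of word2vec learns from. It covers only pair extraction, not training, and is not the authors' pipeline, which is not shown in this preview. The function name `skipgram_pairs`, the window size, and the example comment are assumptions made up for illustration; the comment is not taken from CTSA or TUNIZI.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for each token and its neighbours
    within `window` positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is never its own context
                pairs.append((center, tokens[j]))
    return pairs


# Invented comment mixing Latin-script dialect, English and French
# (hypothetical, for illustration only).
comment = "nchallah corona tet3ada stay safe tout le monde"

# Whitespace split only: no normalization, stemming or stop-word removal.
tokens = comment.split()

pairs = skipgram_pairs(tokens, window=1)
for center, context in pairs[:4]:
    print(center, "->", context)
```

Because the tokens are left untouched, spellings such as `tet3ada` enter the vocabulary as written, so any regularities have to be learned by the representation model itself.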

Author information

Authors and Affiliations

Department of Computer Science, Faculty of Sciences of Monastir, University of Monastir, Monastir, Tunisia
Samawewl Jaballi, Manar Joundy Hazar & Mounir Zrigui
Research Laboratory in Algebra Numbers Theory and Intelligent Systems, Monastir, Tunisia
Samawewl Jaballi, Manar Joundy Hazar & Mounir Zrigui
LaBRI, University of Bordeaux, Talence, France
Samawewl Jaballi & Henri Nicolas
Laboratory LIG, CS 40700, 38058, Grenoble Cedex, France
Salah Zrigui
Computer Center, University of Al-Qadisiyah, Qadisiyah, Iraq
Manar Joundy Hazar

Corresponding author

Correspondence to Samawewl Jaballi.

Editor information

Editors and Affiliations

Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
University of Pau and Adour Countries, Pau, France
Richard Chbeir
Open University of Cyprus, Latsia, Cyprus
Yannis Manolopoulos
Iwate Prefectural University, Takizawa, Japan
Hamido Fujita
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Japan Advanced Institute of Science and Technology, Nomi, Japan
Le Minh Nguyen
Wrocław University of Science and Technology, Wrocław, Poland
Krystian Wojtkiewicz

About this paper

Cite this paper

Jaballi, S., Zrigui, S., Hazar, M.J., Nicolas, H., Zrigui, M. (2024). Exploring Unsupervised Word Representations Models and Neural Networks for Informal Multilingual Text Against Covid-19 Social Media Content. In: Nguyen, N.T., et al. Intelligent Information and Database Systems. ACIIDS 2024. Lecture Notes in Computer Science (LNAI), vol 14796. Springer, Singapore. https://doi.org/10.1007/978-981-97-4985-0_27

DOI: https://doi.org/10.1007/978-981-97-4985-0_27
Published: 16 July 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4984-3
Online ISBN: 978-981-97-4985-0
eBook Packages: Computer Science, Computer Science (R0)
