Exploring Unsupervised Word Representations Models and Neural Networks for Informal Multilingual Text Against Covid-19 Social Media Content

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2024)

Abstract

The Covid-19 pandemic triggered a sharp rise in multilingual social media communication, characterized by dialectal Arabic mixed with code-switched Latin-script text, primarily French and English. This phenomenon, which emerged chiefly during the spread of the pandemic, poses challenges for automated sentiment analysis because appropriate training resources for such linguistically diverse contexts are scarce. Our research systematically examines this informal, multilingual text. Unlike previous approaches, which relied predominantly on deep learning techniques supplemented with hand-crafted features, our study uses no preprocessing steps. We evaluate a range of unsupervised word representation models, including word2vec, BERT, and M-BERT, and explore deep learning classifiers, in particular Convolutional Neural Networks and Bidirectional Long Short-Term Memory networks, for processing complex multilingual data without manually engineered features. The experiments reported in this paper were conducted on the publicly available CTSA and TUNIZI datasets, as well as on the combined CTSA-TUNIZI dataset. The results underscore the effectiveness of these methods for analyzing multilingual and informal text and show that M-BERT is particularly proficient at handling the intricate, ever-changing language of multilingual social media discourse. This capability is crucial during global crises, as it provides key insights into public responses and supports informed pandemic management decisions.
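As a rough illustration of the kind of pipeline the abstract describes (raw text, word vectors, then a convolutional classifier), the sketch below applies one convolution filter with ReLU and max-over-time pooling to a sequence of token vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `embed` and `conv1d_max_pool`, the random toy vectors standing in for word2vec or M-BERT output, and the sample sentence are all hypothetical.

```python
import random


def embed(tokens, dim=4, seed=0):
    """Map each raw token to a fixed toy vector.

    Assumption: random vectors stand in for the output of a real
    word representation model such as word2vec or M-BERT.
    """
    rng = random.Random(seed)
    table = {}
    vectors = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        vectors.append(table[tok])
    return vectors


def conv1d_max_pool(vectors, kernel):
    """Slide one filter over consecutive token windows, apply ReLU,
    and keep the largest activation (max-over-time pooling)."""
    width = len(kernel)
    best = 0.0  # ReLU floor: negative activations never exceed this
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        activation = sum(
            w * x
            for kernel_row, vec in zip(kernel, window)
            for w, x in zip(kernel_row, vec)
        )
        best = max(best, activation)
    return best


if __name__ == "__main__":
    # Invented code-switched sentence, split on whitespace only
    # (no normalization, stemming, or transliteration).
    tokens = "3aslema stay safe tout le monde".split()
    vectors = embed(tokens)
    # One hypothetical filter spanning two tokens of dimension 4.
    kernel = [[0.5, -0.5, 0.5, -0.5], [0.25, 0.25, 0.25, 0.25]]
    print(conv1d_max_pool(vectors, kernel))
```

In a full classifier, many such filters of different widths would each yield one pooled feature, and the resulting feature vector would feed a final classification layer; a BiLSTM would replace the sliding window with a forward and a backward pass over the same token vectors.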

Notes

  1. https://github.com/fbougares/TSAC

  2. https://www.kaggle.com/naim99/tsnaimmhedhbiv2

  3. http://fauconnier.github.io/

  4. https://github.com/chaymafourati/TUNIZI-Sentiment-Analysis-Tunisian-Arabizi-Dataset

Author information

Corresponding author

Correspondence to Samawewl Jaballi.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Jaballi, S., Zrigui, S., Hazar, M.J., Nicolas, H., Zrigui, M. (2024). Exploring Unsupervised Word Representations Models and Neural Networks for Informal Multilingual Text Against Covid-19 Social Media Content. In: Nguyen, N.T., et al. Intelligent Information and Database Systems. ACIIDS 2024. Lecture Notes in Computer Science, vol 14796. Springer, Singapore. https://doi.org/10.1007/978-981-97-4985-0_27

  • DOI: https://doi.org/10.1007/978-981-97-4985-0_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-4984-3

  • Online ISBN: 978-981-97-4985-0

  • eBook Packages: Computer Science, Computer Science (R0)
