A Framework for Arabic Tweets Multi-label Classification Using Word Embedding and Neural Networks Algorithms

Published: 05 July 2020

Abstract

Classifying tweets is useful to many groups, such as tourists, tourism companies, and governments. In this paper, we propose a framework for multi-label classification of Arabic tweets using a word embedding technique and deep learning algorithms. We built our dataset from 160k Arabic tweets gathered from Twitter. We compared two deep learning methods: Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). Our results show that tweets can be classified with our methodology, with no significant difference between the two network types in accuracy or Hamming loss. The accuracy and Hamming loss were nearly 90% and 0.02, respectively.
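To make the two reported metrics concrete, the following is a minimal Python sketch of how Hamming loss and an accuracy score can be computed for multi-label predictions. It is an illustration, not the authors' code: the function names, the toy label matrices, and the choice of exact-match (subset) accuracy are assumptions, since the abstract does not state which accuracy definition was used.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one row of 0/1 label indicators per tweet


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label slots that are predicted incorrectly."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            wrong += int(t != p)
            total += 1
    return wrong / total


def subset_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose whole label set is predicted exactly.

    Assumption: exact-match accuracy; the abstract does not say which
    accuracy definition the paper reports.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Hypothetical toy data: 3 tweets, 4 candidate labels each.
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]

loss = hamming_loss(y_true, y_pred)         # 1 wrong slot out of 12
accuracy = subset_accuracy(y_true, y_pred)  # 2 of 3 tweets fully correct
```

In this toy case one label slot out of twelve is wrong, so the Hamming loss is 1/12, while only two of the three tweets match exactly. The two metrics can diverge because Hamming loss credits partially correct label sets and exact-match accuracy does not.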

    Published In

    BDE '20: Proceedings of the 2020 2nd International Conference on Big Data Engineering
    May 2020
    146 pages
    ISBN: 9781450377225
    DOI: 10.1145/3404512

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Convolutional Neural Networks (CNN)
    2. Multi-label Classification
    3. Python
    4. Recurrent Neural Networks (RNN)
    5. Word Embedding

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    BDE 2020
