Abstract
Rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. In social networks, false rumors may differ significantly from true rumors in their contextual characteristics at the lexical, syntactic, and semantic levels. This study therefore presents BERT-SAWS, a semi-supervised learning model for early verification of Persian rumors that examines content-based and context-based features from three views: Contextual Word Embeddings (CWE), speech act, and Writing Style (WS). The model is built by loading pre-trained Bidirectional Encoder Representations from Transformers (BERT) as an unsupervised language representation, fine-tuning it on a small Persian rumor dataset, and combining it with a supervised learning model to provide an enriched text representation of the rumor's content. This representation gives the model a better grasp of rumor language and lets it verify rumors more accurately than baseline models, for two reasons: (i) it verifies rumors early by focusing on content-based and context-based features of the source rumor; and (ii) it overcomes the shortage of training data for deep neural networks by loading pre-trained BERT, fine-tuning it on the Persian rumor dataset, and combining it with speech act and WS-based features. Empirical results on Twitter and Telegram datasets show that BERT-SAWS improves classifier performance by 2% to 18%. This indicates that speech act and WS features, alongside semantic contextual vectors, are helpful in the rumor verification task.
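The three-view combination described in the abstract can be pictured as simple feature fusion followed by a supervised classifier. The sketch below is illustrative only and is not the authors' implementation: the numeric vectors are made-up placeholders standing in for a fine-tuned BERT embedding, a one-hot speech act label, and a writing-style score, and the names `fuse_views`, `train_logistic`, and `predict`, as well as the choice of logistic regression, are assumptions introduced here.

```python
import math

def fuse_views(cwe, speech_act, writing_style):
    """Concatenate the three per-rumor feature views into one vector."""
    return list(cwe) + list(speech_act) + list(writing_style)

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, b, x):
    """Linear score of a fused feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = sigmoid(score(w, b, xi)) - yi  # d(loss)/d(score)
            w = [wj - lr * grad * xj for wj, xj in zip(w, xi)]
            b -= lr * grad
    return w, b

def predict(w, b, x):
    """Return 1 (false rumor) or 0 (true rumor) for a fused vector."""
    return 1 if sigmoid(score(w, b, x)) >= 0.5 else 0

# Hypothetical toy data: (CWE placeholder, speech act one-hot, WS score, label).
# Real CWE vectors would come from a fine-tuned BERT model, not from constants.
toy = [
    ([0.1, 0.3, 0.2], [1, 0], [0.8], 1),
    ([0.2, 0.1, 0.3], [1, 0], [0.7], 1),
    ([0.3, 0.2, 0.1], [0, 1], [0.2], 0),
    ([0.1, 0.2, 0.3], [0, 1], [0.1], 0),
]

X = [fuse_views(cwe, sa, ws) for cwe, sa, ws, _ in toy]
y = [label for _, _, _, label in toy]
w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
```

Concatenation is only one possible way to combine the views; it is used here because it keeps each view's features visible to the classifier without any assumption about how the paper weights them.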
Notes
Part of Speech
Telegram is a social media platform. https://Telegram.org
About this article
Cite this article
Jahanbakhsh-Nagadeh, Z., Feizi-Derakhshi, MR. & Sharifi, A. A semi-supervised model for Persian rumor verification based on content information. Multimed Tools Appl 80, 35267–35295 (2021). https://doi.org/10.1007/s11042-020-10077-3
DOI: https://doi.org/10.1007/s11042-020-10077-3