Abstract
Early detection of accidents and timely rescue are of paramount importance in reducing fatalities. Social media, which has become an important channel for sharing information, provides valuable data for building machine learning models that classify accident-related posts. Because the context of the word “accident” in a post is difficult to determine, several works in the literature have developed classifiers to predict whether a post actually relates to an accident. However, ensembles of classifiers are known to often outperform individual basic models. We therefore present a novel weighted majority voting-based ensemble approach for context classification of tweets (WM-ECCT) that detects whether tweets are related or unrelated to road accidents. The weighting scheme of the proposed ensemble is based on the ratio of false predictions to true predictions. The model also uses the multi-inducer technique and bootstrap sampling to reduce misclassification rates. Moreover, we propose a context-aware labeling approach for annotating tweets as related or unrelated. Experiments show that the proposed ensemble outperforms the standalone machine learning and ensemble models considered on various performance measures.
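As an illustration only, the voting step described above can be sketched in Python. The abstract does not state the exact weight formula, so the mapping used here, weight = 1 / (1 + false/true), is an assumption, and the function names, labels, and toy predictions are all hypothetical. The bootstrap helper shows only the resampling idea; in a multi-inducer setting each resample would be used to train a different learning algorithm.

```python
import random
from collections import defaultdict


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def ratio_weight(y_true, y_pred):
    """Voting weight from the false-to-true prediction ratio.

    Assumed mapping (not taken from the paper): 1 / (1 + ratio),
    so a lower ratio gives a higher weight.
    """
    true_count = sum(t == p for t, p in zip(y_true, y_pred))
    false_count = len(y_true) - true_count
    ratio = false_count / true_count if true_count else float("inf")
    return 1.0 / (1.0 + ratio)


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical validation labels (1 = related, 0 = unrelated) and the
# predictions of three base classifiers on the same validation tweets.
y_val = [1, 0, 1, 1, 0]
val_preds = [
    [1, 0, 1, 1, 0],  # classifier A: 5 true, 0 false
    [0, 1, 1, 0, 0],  # classifier B: 2 true, 3 false
    [0, 1, 0, 1, 0],  # classifier C: 2 true, 3 false
]
weights = [ratio_weight(y_val, preds) for preds in val_preds]

# Votes of A, B, and C on one new tweet.
votes = ["related", "unrelated", "unrelated"]
decision = weighted_majority_vote(votes, weights)
print(weights, decision)

# Bootstrap resample of the validation indices (seeded for repeatability).
resample = bootstrap_sample(list(range(len(y_val))), random.Random(0))
```

In this toy case a plain majority vote would return "unrelated", whereas the weighted vote follows classifier A because its false-to-true ratio on the validation tweets is the lowest.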
Data availability
Available on reasonable request from the corresponding author. No datasets were generated or analysed during the current study.
Notes
In this work, the terms "basic models" and "Single-ML classifiers" are used interchangeably.
Acknowledgements
Not Applicable.
Author information
Authors and Affiliations
Contributions
Sanjib Kumar Raul: Conception, Design, Proposed Model, Model Analysis, Drafting, Data Collection, Experimentation, Review, and Approval. Rashmi Ranjan Rout: Conception, Design, Model Analysis, Drafting, Review, and Approval. D.V.L.N. Somayajulu: Conception, Design, Model Analysis, Drafting, Review, and Approval.
Corresponding author
Ethics declarations
Conflict of interest
The authors affirm that they have no known conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Raul, S.K., Rout, R.R. & Somayajulu, D.V.L.N. A novel weighted majority voting-based ensemble approach for detection of road accidents using social media data. Soc. Netw. Anal. Min. 14, 214 (2024). https://doi.org/10.1007/s13278-024-01368-w
DOI: https://doi.org/10.1007/s13278-024-01368-w