Abstract
Due to population growth and rapid urbanization in Indian cities, transportation has emerged as a critical concern that affects a large number of commuters every day. It is therefore important for urban planners, policymakers, and transportation authorities in India to understand the public's grievances regarding transportation. This study aims to uncover valuable information about specific transport-related complaints or grievances in Indian cities from the vast pool of user-generated content on social media platforms such as Twitter. As an initial step, we explore the broad sentiment of commuters in six Indian metropolitan cities towards the existing transportation systems, and create a dataset that broadly classifies tweets into negative and positive sentiments. Next, we identify a set of fine-grained grievances in these tweets, and thus create the first dataset of transport-related tweets labelled with specific grievances in a multi-label setting. To our knowledge, no existing dataset labels tweets according to the specific concerns raised in the posts. We apply several classification models to the dataset to classify transportation-related tweets into the specific grievances. We further conduct a city-wise analysis to better understand the specific transport-related complaints prevalent in each Indian city.
Data Availability
After acceptance of the manuscript, a publicly accessible link to the data will be provided within the manuscript.
Competing interests
The authors declare no competing interests.
Notes
We had intended to collect more data periodically to understand the persistence of grievances, seasonal variations in grievances, and so on. However, Twitter changed its data policy and restricted free data collection through its API in early April 2023 (Bell 2023), which prevented us from collecting more data.
References
Agarwal S, Kumar A, Ganguly R (2024) Investigating transformer-based models for automated e-governance in Indian railway using twitter. Multimedia Tools Appl 83(2):4551–4577
Agrawal A, Kuriakose PN (2022) Implications of a twitter data-centred methodology for assessing commuters’ perceptions of the Delhi metro in India. Comput Urban Sci 2(1):38
Akhtar N, Beg MS (2021) Railway complaint tweets identification. In: Data management, analytics and innovation: proceedings of ICDMAI 2020, vol 1. Springer, Berlin, pp 195–207
Barbieri F, Camacho-Collados J, Neves L, Espinosa-Anke L (2020) Tweeteval: unified benchmark and comparative evaluation for tweet classification. arXiv preprint arXiv:2010.12421
Bell K (2023) Twitter shut off its free API and it’s breaking a lot of apps (April 2023). https://www.engadget.com/twitter-shut-off-its-free-api-and-its-breaking-a-lot-of-apps-222011637.html
Congosto M, Fuentes-Lorenzo D, Sánchez L (2015) Microbloggers as sensors for public transport breakdowns. IEEE Internet Comput 19(6):18–25
Das RD (2021) Understanding users’ satisfaction towards public transit system in India: a case-study of Mumbai. ISPRS Int J Geo Inf 10(3):155
Das RD, Purves RS (2019) Exploring the potential of twitter to understand traffic events and their locations in greater Mumbai, India. IEEE Trans Intell Transp Syst 21(12):5213–5222
Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Eboli L, Mazzulla G (2012) Performance indicators for an objective measure of public transport service quality. Inst Study Transp Within Eur Econ Integr 51:1–4
Gaikwad AS (2019) Twitter sentiment analysis approaches: a survey. Int J Emerg Technol Learn 15(15):79
Gal-Tzur A, Grant-Muller SM, Kuflik T, Minkov E, Nocera S, Shoor I (2014) The potential of social media in delivering transport policy goals. Transp Policy 32:115–123
Guven ZA (2021) Comparison of bert models and machine learning methods for sentiment analysis on Turkish tweets. In: 2021 6th international conference on computer science and engineering (UBMK). IEEE, pp 98–101
Hadjidimitriou NS, Lippi M, Mamei M (2023) Explaining population variation after the 2016 central Italy earthquake using call data records and twitter. Soc Netw Anal Min 13(1):140
Hutto C, Gilbert E (2014) Vader: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol 8, pp 216–225
Lamsal R, Read MR, Karunasekera S (2024) CrisisTransformers: pre-trained language models and sentence encoders for crisis-related social media texts. Knowl-Based Syst 296(4):111916
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: a robustly optimized Bert pretraining approach. arXiv preprint arXiv:1907.11692
Lock O, Pettit C (2020) Social media as passive geo-participation in transportation planning-how effective are topic modeling & sentiment analysis in comparison with citizen surveys? Geo-spatial Inf Sci 23(4):275–292
Mandloi L, Patel R (2020) Twitter sentiments analysis using machine learning methods. In: 2020 international conference for emerging technology (INCET). IEEE, pp 1–5
Nandwani P, Verma R (2021) A review on sentiment analysis and emotion detection from text. Soc Netw Anal Min 11(1):81
Naw N (2018) Twitter sentiment analysis using support vector machine and k-NN classifiers. IJSRP 8:407–411
Nguyen DQ, Vu T, Nguyen AT (2020) Bertweet: a pre-trained language model for English tweets. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. pp 9–14
Nikolaidou A, Papaioannou P (2018) Utilizing social media in transport planning and public transit quality: survey of literature. J Transp Eng Part A: Syst 144(4):04018007
Nokkaew M, Nongpong K, Yeophantong T, Ploykitikoon P, Arjharn W, Siritaratiwat A, Narkglom S, Wongsinlatam W, Remsungnen T, Namvong A et al (2023) Analyzing online public opinion on Thailand–China high-speed train and Laos–China railway mega-projects using advanced machine learning for sentiment analysis. Soc Netw Anal Min 14(1):15
Osorio-Arjona J, Horak J, Svoboda R, García-Ruíz Y (2021) Social media semantic perceptions on Madrid Metro system: using Twitter data to link complaints to space. Sustain Cities Soc 64:102530
Parveen H, Pandey S (2016) Sentiment analysis on twitter data-set using Naive Bayes algorithm. In: 2016 2nd international conference on applied and theoretical computing and communication technology (iCATccT). pp 416–419. https://doi.org/10.1109/ICATCCT.2016.7912034
Passonneau R (2006) Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation. In: Proceedings of the international conference on language resources and evaluation (LREC)
Poddar S, Samad AM, Mukherjee R, Ganguly N, Ghosh S (2022) Caves: a dataset to facilitate explainable classification and summarization of concerns towards COVID vaccines. In: Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval. pp 3154–3164
Qi Y, Shabrina Z (2023) Sentiment analysis using twitter data: a comparative application of lexicon- and machine-learning-based approach. Soc Netw Anal Min 13(1):31
Rita P, António N, Afonso AP (2023) Social media discourse and voting decisions influence: sentiment analysis in tweets during an electoral period. Soc Netw Anal Min 13(1):46
Sreeja I, Sunny JV, Jatian L (2020) Twitter sentiment analysis on airline tweets in India using R language. In: Journal of Physics: conference series. IOP Publishing, vol 1427, p 012003
Tao K, Abel F, Hauff C, Houben G, Gadiraju U (2013) Groundhog day: near-duplicate detection on twitter. In: Proceedings of the world wide web (WWW)
Thelwall M (2015) Evaluating the comprehensiveness of twitter search API results: a four step method
Truelove M, Vasardani M, Winter S (2017) Testing the event witnessing status of micro-bloggers from evidence in their micro-blogs. PLoS ONE. https://doi.org/10.1371/journal.pone.0189378
Vishwakarma A, Chugh M (2023) Covid-19 vaccination perception and outcome: society sentiment analysis on twitter data in India. Soc Netw Anal Min 13(1):1–12
Windasari IP, Uzzi FN, Satoto KI (2017) Sentiment analysis on twitter posts: an analysis of positive or negative opinion on Gojek. In: 2017 4th international conference on information technology, computer, and electrical engineering (ICITACEE). IEEE, pp 266–269
Yaakub MR, Latiffi MIA, Zaabar LS (2019) A review on sentiment analysis techniques and applications. In: IOP conference series: materials science and engineering. IOP Publishing, vol 551, p 012070
Zhou X, Tao X, Yong J, Yang Z (2013) Sentiment analysis on tweets for social events. In: Proceedings of the 2013 IEEE 17th international conference on computer supported cooperative work in design (CSCWD). IEEE, pp 557–562
Author information
Contributions
Rahul P, A. Das, T. Jaiswal, and V. Singh were responsible for data collection, preparation of the dataset for model building, and implementation of the different models. S. Poddar was responsible for some of the model implementations. S. Ghosh was involved in formulating the research problem, analysing the problem, and reviewing the manuscript. M. Basu contributed to the analysis of the research problem, the literature survey, the city-wise summarization of grievances, and the writing of the manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Pullanikkat, R., Poddar, S., Das, A. et al. Utilizing the Twitter social media to identify transportation-related grievances in Indian cities. Soc. Netw. Anal. Min. 14, 118 (2024). https://doi.org/10.1007/s13278-024-01278-x
DOI: https://doi.org/10.1007/s13278-024-01278-x