Abstract
Natural disasters, as well as human-made disasters, can have a deep impact on wide geographic areas, and emergency responders can benefit from an early estimate of their consequences. This work presents CrisMap, a Big Data crisis mapping system capable of quickly collecting and analyzing social media data. CrisMap extracts potentially actionable crisis-related information from tweets by adopting a classification technique based on word embeddings and by exploiting a combination of readily available semantic annotators to geoparse tweets. The enriched tweets are then visualized in customizable, Web-based dashboards, which also leverage ad-hoc quantitative visualizations such as choropleth maps. The maps produced by our system help to estimate the impact of an emergency in its early phases, to identify areas that have been severely struck, and to acquire greater situational awareness. We extensively benchmark the performance of our system on two Italian natural disasters by validating our maps against authoritative data. Finally, we perform a qualitative case study on a recent devastating earthquake that occurred in Central Italy.
Notes
The plugin is publicly available at https://github.com/marghe943/kibanaChoroplethMap.git.
See Section 5 for more details about the proposed approach.
Our implementation uses the SVC class available in the scikit-learn Python package.
This hypothesis states that words appearing in similar contexts tend to have similar meanings.
We did not use more sophisticated methods like “Paragraph Vector” (Le and Mikolov 2014) because these statistical methods do not work well for short texts like tweets.
We set the class_weight parameter to ’balanced’ (see the scikit-learn documentation at http://bit.ly/2g5QSqk). This tells the SVM to weight the labels differently during training, so that errors made on under-represented classes, as measured by the loss function in use, count more.
When several configurations achieve the same F1, we prefer the one whose precision and recall values are most balanced.
http://www.regione.sardegna.it/documenti/1_231_20140403083152.pdf - Italian Civil Protection report on damage to private properties, public infrastructures, and production facilities.
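Two of the notes above describe small procedures: the ’balanced’ class weighting and the tie-break between configurations with equal F1. The following is a minimal sketch of both in plain Python, not the authors' code. The weighting follows the formula given in the scikit-learn documentation for the ’balanced’ option, n_samples / (n_classes * count of the class). The function names, the label names, the example scores, and the use of the absolute difference between precision and recall as the measure of "balance" are assumptions made for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    This mirrors the formula documented by scikit-learn for
    class_weight='balanced'; rarer classes receive larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pick_configuration(results, tol=1e-9):
    """Pick the configuration with the best F1.

    `results` is a list of (name, precision, recall) tuples.
    Ties on F1 (within `tol`, an assumed tolerance) are broken in
    favour of the smallest |precision - recall| gap, which is one
    possible reading of "more balanced".
    """
    best_f1 = max(f1_score(p, r) for _, p, r in results)
    tied = [(name, p, r) for name, p, r in results
            if abs(f1_score(p, r) - best_f1) <= tol]
    return min(tied, key=lambda item: abs(item[1] - item[2]))[0]


# Hypothetical, skewed training labels: few "damage" tweets.
example_labels = ["damage"] * 2 + ["no_damage"] * 8
print(balanced_class_weights(example_labels))

# Hypothetical configurations; A and B have the same F1.
example_results = [
    ("A", 0.90, 0.60),
    ("B", 0.72, 0.72),
    ("C", 0.50, 0.50),
]
print(pick_configuration(example_results))
```

In this toy run the rarer label receives the larger weight, and of the two configurations tied on F1 the one with equal precision and recall is selected.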
References
Avvenuti, M. et al. (2014a). EARS (Earthquake Alert and Report System): a real time decision support system for earthquake crisis management. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1749–1758). ACM.
Avvenuti, M. et al. (2014b). Earthquake emergency management by social sensing. In 2014 IEEE International conference on pervasive computing and communications workshops (PERCOM Workshops) (pp. 587–592). IEEE.
Avvenuti, M. et al. (2016a). A framework for detecting unfolding emergencies using humans as sensors. SpringerPlus, 5.1, 43.
Avvenuti, M. et al. (2016b). Impromptu crisis mapping to prioritize emergency response. Computer, 49.5, 28–37.
Avvenuti, M. et al. (2016c). Predictability or early warning: using social media in modern emergency response. IEEE Internet Computing, 20.6, 4–6.
Avvenuti, M. et al. (2017). Hybrid crowdsensing: a novel paradigm to combine the strengths of opportunistic and participatory crowdsensing. In Proceedings of the 26th international conference on World Wide Web companion (pp. 1413–1421). International World Wide Web Conferences Steering Committee.
Bauduy, J. (2010). Mapping a crisis, one text message at a time. Social Education, 74.3, 142–143.
Bengio, Y., Courville, A., Vincent, P. (2013). Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35.8, 1798–1828.
Bengio, Y. et al. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
Burks, L., Miller, M., Zadeh, R. (2014). Rapid estimate of ground shaking intensity by combining simple earthquake characteristics with tweets. In 10th US National conference on earthquake engineering.
Cheng, Z., Caverlee, J., Lee, K. (2010). You are where you tweet: a content-based approach to geo-locating twitter users. In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 759–768). ACM.
Cheong, F., & Cheong, C. (2011). Social media data mining: a social network analysis of tweets during the 2010-2011 Australian floods. PACIS, 11, 46–46.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20.3, 273–297.
Cresci, S. et al. (2015a). Crisis mapping during natural disasters via text analysis of social media messages. In International conference on Web information systems engineering–WISE 2015 (pp. 250–258). Springer.
Cresci, S. et al. (2015b). A linguistically-driven approach to cross-event damage assessment of natural disasters from social media messages. In Proceedings of the 24th international conference on World Wide Web companion (pp. 1195–1200). International World Wide Web Conferences Steering Committee.
Cresci, S. et al. (2017). Nowcasting of earthquake consequences using big social data. IEEE Internet Computing, 21.6, 37–45.
Dashti, S. et al. (2014). Supporting disaster reconnaissance with social media data: a design-oriented case study of the 2013 Colorado floods. In ISCRAM.
Dewan, P. et al. (2017). Towards understanding crisis events on online social networks through pictures. In Proc. of the IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM). ACM.
de Oliveira, M.G. et al. (2015). Producing volunteered geographic information from social media for LBSN improvement. Journal of Information and Data Management, 6.1, 81.
Earle, P.S., Bowden, D. C., Guy, M. (2012). Twitter earthquake detection: earthquake monitoring in a social world. Annals of Geophysics, 54, 6.
Ferragina, P., & Scaiella, U. (2010). Tagme: on-the-fly annotation of short text fragments (by wikipedia entities). In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 1625–1628). ACM.
Gao, H., Barbier, G., Goolsby, R. (2011). Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intelligent Systems, 26.3, 10–14.
Gelernter, J., & Balaji, S. (2013). An algorithm for local geoparsing of microtext. GeoInformatica, 17.4, 635–667.
Gelernter, J., & Mushegian, N. (2011). Geoparsing messages from microtext. Transactions in GIS, 15.6, 753–773.
Goolsby, R. (2010). Social media as crisis platform: the future of community maps/crisis maps. ACM Transactions on Intelligent Systems and Technology (TIST), 1.1, 7.
Gupta, A. et al. (2013a). Faking Sandy: characterizing and identifying fake images on twitter during hurricane Sandy. In Proceedings of the 22nd international conference on World Wide Web, WWW ’13 Companion (pp. 729–736). ACM.
Gupta, A., Lamba, H., Kumaraguru, P. (2013b). $1.00 per RT #BostonMarathon #PrayForBoston: Analyzing fake content on Twitter. In 2013 APWG eCrime researchers summit (pp. 1–12).
Guy, M. et al. (2014). Social media based earthquake detection and characterization. In KDD-LESI 2014: Proceedings of the 1st KDD workshop on learning about emergencies from social information at KDD14 (pp. 9–10).
Imran, M. et al. (2013). Extracting information nuggets from disaster-related messages in social media. In Proceedings of the 10th international ISCRAM conference (pp. 791–801).
Imran, M. et al. (2015). Processing social media messages in mass emergency: a survey. ACM Computing Surveys, 47.4, 67.
Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20.4, 422–446.
Kropivnitskaya, Y. et al. (2017). The predictive relationship between earthquake intensity and tweets rate for real-time ground-motion estimation. In Seismological research letters.
Kryvasheyeu, Y. et al. (2016). Rapid assessment of disaster damage using social media activity. Science Advances, 2.3, e1500779.
Lagerstrom, R. et al. (2016). Image classification to support emergency situation awareness. Frontiers in Robotics and AI, 3, 54.
Le, Q.V., & Mikolov, T. (2014). Distributed representations of sentences and documents. In Proceedings of the 31st international conference on machine learning (ICML 2014) (pp. 1188–1196).
Lewis, G. (2007). Evaluating the use of a low-cost unmanned aerial vehicle platform in acquiring digital imagery for emergency response. In Geomatics solutions for disaster management (pp. 117–133). Springer.
Liang, Y., Caverlee, J., Mander, J. (2013). Text vs. images: on the viability of social media to assess earthquake damage. In Proceedings of the 22nd international conference on World Wide Web companion (pp. 1003–1006). International World Wide Web Conferences Steering Committee.
Meier, P. (2012). Crisis mapping in action: how open source software and global volunteer networks are changing the world, one map at a time. Journal of Map & Geography Libraries, 8.2, 89–100.
Middleton, S. E., Middleton, L., Modafferi, S. (2014). Real-time crisis mapping of natural disasters using social media. IEEE Intelligent Systems, 29.2, 9–17.
Mikolov, T. et al. (2013). Distributed representations of words and phrases and their compositionality. In Burges, C. J. C. et al. (Eds.) Advances in neural information processing systems (Vol. 26, pp. 3111–3119). Curran Associates, Inc.
Pablo, N. et al. (2011). DBpedia spotlight: shedding light on the web of documents. In Proceedings of the 7th international conference on semantic systems (pp. 1–8). ACM.
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22.10, 1345–1359.
Sakaki, T., Okazaki, M., Matsuo, Y. (2013). Tweet analysis for real-time event detection and earthquake reporting system development. IEEE Transactions on Knowledge and Data Engineering, 25.4, 919–931.
Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34.1, 1–47.
Tassiulas, L., & Ephremides, A. (1992). Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control, 37.12, 1936–1948.
Trani, S. et al. (2014). Dexter 2.0: an open source tool for semantically enriching data. In Proceedings of the 2014 international conference on semantic web (Posters & Demos) (pp. 417–420). Springer.
Usbeck, R. et al. (2015). GERBIL: general entity annotator benchmarking framework. In Proceedings of the 24th international conference on World Wide Web (pp. 1133–1143). ACM.
Verma, S. et al. (2011). Natural language processing to the rescue? Extracting situational awareness tweets during mass emergency. In Proceedings of the 5th international AAAI conference on web and social media (ICWSM). AAAI.
Vieweg, S., & Hodges, A. (2014). Rethinking context: Leveraging human and machine computation in disaster response. Computer, 47.4, 22–27.
Wang, L., & Kant, K. (2014). Special issue on computational sustainability. IEEE Transactions on Emerging Topics in Computing, 2.2, 119–121.
Weber, I., & Garimella, V. R. K. (2014). Visualizing user-defined, discriminative geo-temporal Twitter activity. In ICWSM.
Acknowledgments
This research is supported in part by the EU H2020 Program under the scheme INFRAIA-1-2014-2015: Research Infrastructures grant agreement #654024 SoBigData: Social Mining & Big Data Ecosystem, and by the MIUR (Ministero dell’Istruzione, dell’Università e della Ricerca) and Regione Toscana (Tuscany, Italy) funding the SmartNews: Social sensing for Breaking News project: PAR-FAS 2007-2013.
Cite this article
Avvenuti, M., Cresci, S., Del Vigna, F. et al. CrisMap: a Big Data Crisis Mapping System Based on Damage Detection and Geoparsing. Inf Syst Front 20, 993–1011 (2018). https://doi.org/10.1007/s10796-018-9833-z
DOI: https://doi.org/10.1007/s10796-018-9833-z