[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

SWS: a complexity-optimized solution for spatial-temporal kernel density visualization

Published: 01 December 2021 Publication History

Abstract

Spatial-temporal kernel density visualization (STKDV) has been extensively used in a wide range of applications, e.g., disease outbreak analysis, traffic accident hotspot detection, and crime hotspot detection. While STKDV can provide accurate and comprehensive data visualization, computing STKDV is time-consuming, which is not scalable to large-scale datasets. To address this issue, we develop a new sliding-window-based solution (SWS), which theoretically reduces the time complexity for generating STKDV, without increasing the space complexity. Moreover, we incorporate SWS with the progressive visualization framework, which can continuously output partial visualization results to users (from coarse to fine), until users satisfy the visualization. Our experimental studies on five large-scale datasets show that SWS achieves 1.71x to 24x speedup compared with the state-of-the-art methods.

References

[1]
[n. d.]. ArcGIS. http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/how-kernel-density-works.htm.
[2]
[n. d.]. CrimeStat: Spatial Statistics Program for the Analysis of Crime Incident Locations. https://nij.ojp.gov/topics/articles/crimestat-spatial-statistics-program-analysis-crime-incident-locations.
[3]
[n. d.]. Hong Kong GeoData Store. https://geodata.gov.hk/gs/view-dataset?uuid=d4ccd9be-3bc0-449b-bd27-9eb9b615f2db&sidx=0.
[4]
[n. d.]. IKCEST: Disaster Risk Reduction. http://drr.ikcest.org/knowledge_service/ncp.html.
[5]
[n. d.]. Los Angeles Open Data. https://data.lacity.org/A-Safe-City/Crime-Data-from-2010-to-2019/63jg-8b9z.
[6]
[n. d.]. NYC Open Data. https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95.
[7]
[n.d.]. NYC Yellow Taxi Trip Data. https://data.cityofnewyork.us/Transportation/2014-Yellow-Taxi-Trip-Data/gkne-dk5s.
[8]
[n. d.]. Ontario Open Data. https://data.ontario.ca/dataset/confirmed-positive-cases-of-covid-19-in-ontario.
[9]
[n. d.]. QGIS. https://docs.qgis.org/2.18/en/docs/user_manual/plugins/plugins_heatmap.html.
[10]
[n. d.]. Seattle Open Data. https://data.seattle.gov/Public-Safety/SPD-Crime-Data-2008-Present/tazs-3rd5.
[11]
Jon Louis Bentley. 1975. Multidimensional Binary Search Trees Used for Associative Searching. Commun. ACM 18, 9 (1975), 509--517.
[12]
Chris Brunsdon, Jonathan Corcoran, and Gary Higgs. 2007. Visualising space and time in crime patterns: A comparison of methods. Comput. Environ. Urban Syst. 31, 1 (2007), 52--75.
[13]
Spencer Chainey, Lisa Tompson, and Sebastian Uhlig. 2008. The Utility of Hotspot Mapping for Predicting Spatial Patterns of Crime. Security Journal 21, 1 (01 Feb 2008), 4--28.
[14]
Tsz Nam Chan, Reynold Cheng, and Man Lung Yiu. 2020. QUAD: Quadratic-Bound-based Kernel Density Visualization. In SIGMOD. 35--50.
[15]
Tsz Nam Chan, Pak Lon Ip, Leong Hou U, Byron Choi, and Jianliang Xu. [n. d.]. SWS: A Complexity-Optimized Solution for Spatial-Temporal Kernel Density Visualization (Technical Report). https://github.com/STKDV/STKDV/blob/main/SWS_STKDV_TR.pdf.
[16]
Tsz Nam Chan, Pak Lon Ip, Leong Hou U, Byron Choi, and Jianliang Xu. 2022. SAFE: A Share-and-Aggregate Bandwidth Exploration Framework for Kernel Density Visualization. Proc. VLDB Endow. 15, 3 (2022), 513--526.
[17]
Tsz Nam Chan, Pak Lon Ip, Leong Hou U, Weng Hou Tong, Shivansh Mittal, Ye Li, and Reynold Cheng. 2021. KDV-Explorer: A Near Real-Time Kernel Density Visualization System for Spatial Analysis. Proc. VLDB Endow. 14, 12 (2021), 2655--2658.
[18]
Tsz Nam Chan, Zhe Li, Leong Hou U, Jianliang Xu, and Reynold Cheng. 2021. Fast Augmentation Algorithms for Network Kernel Density Visualization. Proc. VLDB Endow. 14, 9 (2021), 1503--1516. http://www.vldb.org/pvldb/vol14/p1503-chan.pdf
[19]
Tsz Nam Chan, Leong Hou U, Reynold Cheng, Man Lung Yiu, and Shivansh Mittal. 2020. Efficient Algorithms for Kernel Aggregation Queries. IEEE Transactions on Knowledge and Data Engineering (2020), 1--1.
[20]
Tsz Nam Chan, Leong Hou U, Byron Choi, and Jianliang Xu. 2022. SLAM: Efficient Sweep Line Algorithms for Kernel Density Visualization. In SIGMOD (To appear).
[21]
Tsz Nam Chan, Man Lung Yiu, and Leong Hou U. 2019. KARL: Fast Kernel Aggregation Queries. In ICDE. 542--553.
[22]
Wei Chen, Fangzhou Guo, and Fei-Yue Wang. 2015. A Survey of Traffic Data Visualization. IEEE Trans. Intelligent Transportation Systems 16, 6 (2015), 2970--2984.
[23]
Mark de Berg, Otfried Cheong, Marc J. van Kreveld, and Mark H. Overmars. 2008. Computational geometry: algorithms and applications, 3rd Edition. Springer. https://www.worldcat.org/oclc/227584184
[24]
Eric Delmelle, Coline Dony, Irene Casas, Meijuan Jia, and Wenwu Tang. 2014. Visualizing the impact of space-time uncertainties on dengue fever patterns. International Journal of Geographical Information Science 28, 5 (2014), 1107--1127.
[25]
Edward Gan and Peter Bailis. 2017. Scalable Kernel Density Classification via Threshold-Based Pruning. In ACM SIGMOD. 945--959.
[26]
Thanaa M. Ghanem, Moustafa A. Hammad, Mohamed F. Mokbel, Walid G. Aref, and Ahmed K. Elmagarmid. 2007. Incremental Evaluation of Sliding-Window Queries over Data Streams. IEEE Trans. Knowl. Data Eng. 19, 1 (2007), 57--72.
[27]
A. Gramacki. 2017. Nonparametric Kernel Density Estimation and Its Computational Aspects. Springer International Publishing. https://books.google.com.hk/books?id=PCpEDwAAQBAJ
[28]
Alexander G. Gray and Andrew W. Moore. 2003. Nonparametric Density Estimation: Toward Computational Tractability. In SDM. 203--211.
[29]
Timothy Hart and Paul Zandbergen. 2014. Kernel density estimation and hotspot mapping: examining the influence of interpolation method, grid cell size, and bandwidth on crime forecasting. Policing: An International Journal of Police Strategies and Management 37 (2014), 305--323.
[30]
Alexander Hohl, Eric Delmelle, Wenwu Tang, and Irene Casas. 2016. Accelerating the discovery of space-time patterns of infectious diseases using parallel computing. Spatial and Spatio-temporal Epidemiology 19 (2016), 10 -- 20.
[31]
Yujie Hu, Fahui Wang, Cecile Guin, and Haojie Zhu. 2018. A spatio-temporal kernel density estimation framework for predictive crime hotspot mapping and evaluation. Applied Geography 99 (2018), 89 -- 97.
[32]
Youngok Kang, Nahye Cho, and Serin Son. 2018. Spatiotemporal characteristics of elderly population's traffic accidents in Seoul using space-time cube and space-time kernel density estimation. PLOS ONE 13, 5 (05 2018), 1--17.
[33]
Yan Kestens, Alexandre Lebel, Mark Daniel, Marius Thériault, and Robert Pampalon. 2010. Using experienced activity spaces to measure foodscape exposure. Health & Place 16, 6 (2010), 1094 -- 1103.
[34]
Ove Daae Lampe and Helwig Hauser. 2011. Interactive visualization of streaming data with Kernel Density Estimation. In PacificVis. 171--178.
[35]
Jay Lee, Junfang Gong, and Shengwen Li. 2017. Exploring spatiotemporal clusters based on extended kernel estimation methods. Int. J. Geogr. Inf. Sci. 31, 6 (2017), 1154--1177.
[36]
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, and Peter A. Tucker. 2005. Semantics and Evaluation Techniques for Window Aggregates in Data Streams. In SIGMOD. 311--322.
[37]
Yunxuan Li, Mohamed Abdel-Aty, Jinghui Yuan, Zeyang Cheng, and Jian Lu. 2020. Analyzing traffic violation behavior at urban intersections: A spatio-temporal kernel density estimation approach using automated enforcement system data. Accident Analysis & Prevention 141 (2020), 105509.
[38]
Jonas Lukasczyk, Ross Maciejewski, Christoph Garth, and Hans Hagen. 2015. Understanding hotspots: a topological visual analytics approach. In SIGSPATIAL. 36:1--36:10.
[39]
Andrew W. Moore. 2000. The Anchors Hierarchy: Using the Triangle Inequality to Survive High Dimensional Data. In UAI. 397--405.
[40]
Tomoki Nakaya and Keiji Yano. 2010. Visualising Crime Clusters in a Space-time Cube: An Exploratory Data-analysis Approach Using Spacetime Kernel Density Estimation and Scan Statistics. Transactions in GIS 14, 3 (2010), 223--239. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1467-9671.2010.01194.x
[41]
Yongjoo Park, Michael J. Cafarella, and Barzan Mozafari. 2016. Visualization-aware sampling for very large databases. In ICDE. 755--766.
[42]
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake VanderPlas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Edouard Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011), 2825--2830.
[43]
Alexandre Perrot, Romain Bourqui, Nicolas Hanusse, Frédéric Lalanne, and David Auber. 2015. Large Interactive Visualization of Density Functions on Big Data Infrastructure. In LDAV. 99--106.
[44]
Jeff M. Phillips. 2013. ϵ-Samples for Kernels. In SODA. 1622--1632.
[45]
Jeff M. Phillips and Wai Ming Tai. 2018. Improved Coresets for Kernel Density Estimates. In SODA. 2718--2727.
[46]
Jeff M. Phillips and Wai Ming Tai. 2018. Near-Optimal Coresets of Kernel Density Estimates. In SOCG. 66:1--66:13.
[47]
QGIS Development Team. 2009. QGIS Geographic Information System. Open Source Geospatial Foundation. http://qgis.osgeo.org
[48]
Xuedi Qin, Yuyu Luo, Nan Tang, and Guoliang Li. 2020. Making data visualization more efficient and effective: a survey. VLDB J. 29, 1 (2020), 93--117.
[49]
Vikas C. Raykar, Ramani Duraiswami, and Linda H. Zhao. 2010. Fast Computation of Kernel Estimators. Journal of Computational and Graphical Statistics 19, 1 (2010), 205--220.
[50]
H. Samet. 2006. Foundations of Multidimensional and Metric Data Structures.
[51]
Erik Saule, Dinesh Panchananam, Alexander Hohl, Wenwu Tang, and Eric Delmelle. 2017. Parallel Space-Time Kernel Density Estimation. In ICPP. 483--492.
[52]
D.W. Scott. 1992. Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley. https://books.google.com.hk/books?id=7crCUS_F2ocC
[53]
A. Shein and P. K. Chrysanthis. 2020. Multi-Query Optimization of Incrementally Evaluated Sliding-Window Aggregations. IEEE Transactions on Knowledge and Data Engineering (2020), 1--1.
[54]
Kanat Tangwongsan, Martin Hirzel, and Scott Schneider. 2019. Optimal and General Out-of-Order Sliding-Window Aggregation. PVLDB 12, 10 (2019), 1167--1180.
[55]
Kanat Tangwongsan, Martin Hirzel, Scott Schneider, and Kun-Lung Wu. 2015. General Incremental Sliding-Window Aggregation. PVLDB 8, 7 (2015), 702--713.
[56]
Yufei Tao and Dimitris Papadias. 2006. Maintaining Sliding Window Skylines on Data Streams. IEEE Trans. Knowl. Data Eng. 18, 2 (2006), 377--391.
[57]
Alexandru C. Telea. 2014. Data Visualization: Principles and Practice, Second Edition (2nd ed.). A. K. Peters, Ltd., Natick, MA, USA.
[58]
Álvaro Villalba, Josep Lluis Berral, and David Carrera. 2019. Constant-Time Sliding Window Framework with Reduced Memory Footprint and Efficient Bulk Evictions. IEEE Trans. Parallel Distributed Syst. 30, 3 (2019), 486--500.
[59]
Xiang Wang, Ying Zhang, Wenjie Zhang, Xuemin Lin, and Zengfeng Huang. 2016. SKYPE: Top-k Spatial-keyword Publish/Subscribe Over Sliding Window. PVLDB 9, 7 (2016), 588--599.
[60]
Qiujun Wei, Jiangfeng She, Shuhua Zhang, and Jinsong Ma. 2018. Using Individual GPS Trajectories to Explore Foodscape Exposure: A Case Study in Beijing Metropolitan Area. International journal of environmental research and public health 15 (02 2018).
[61]
Kun Xie, Kaan Ozbay, Abdullah Kurkcu, and Hong Yang. 2017. Analysis of Traffic Crashes Involving Pedestrians Using Big Data: Investigation of Contributing Factors and Identification of Hotspots. Risk Analysis 37, 8 (2017), 1459--1476. https://EconPapers.repec.org/RePEc:wly:riskan:v:37:y:2017:i:8:p:1459--1476
[62]
Changjiang Yang, Ramani Duraiswami, and Larry S. Davis. 2004. Efficient Kernel Machines Using the Improved Fast Gauss Transform. In NIPS. 1561--1568. http://papers.nips.cc/paper/2550-efficient-kernel-machines-using-the-improved-fast-gauss-transform
[63]
P. Zezula, G. Amato, V. Dohnal, and M. Batko. 2006. Similarity Search: The Metric Space Approach. Springer US. https://books.google.com.hk/books?id=KTkWXsiPXR4C
[64]
Guiming Zhang, A-Xing Zhu, and Qunying Huang. 2017. A GPU-accelerated adaptive kernel density estimation approach for efficient point pattern analysis on spatial big data. International Journal of Geographical Information Science 31, 10 (2017), 2068--2097.
[65]
Zhijie Zhang, Dongmei Chen, Wenbao Liu, Jeffrey Racine, Seng-Huat Ong, Yue Chen, Genming Zhao, and Qingwu Jiang. 2011. Nonparametric Evaluation of Dynamic Disease Risk: A Spatio-Temporal Kernel Approach. PloS one 6 (03 2011), e17381.
[66]
Xiangyu Zhao and Jiliang Tang. 2018. Crime in Urban Areas: A Data Mining Perspective. SIGKDD Explorations 20, 1 (2018), 1--12.
[67]
Yan Zheng, Jeffrey Jestes, Jeff M. Phillips, and Feifei Li. 2013. Quality and efficiency for kernel density estimates in large data. In SIGMOD. 433--444.
[68]
Yan Zheng, Yi Ou, Alexander Lex, and Jeff M. Phillips. 2017. Visualization of Big Spatial Data using Coresets for Kernel Density Estimates. In IEEE Symposium on Visualization in Data Science (VDS '17), to appear. IEEE.
[69]
Yan Zheng and Jeff M. Phillips. 2015. L∞ Error and Bandwidth Selection for Kernel Density Estimates of Large Data. In SIGKDD. 1533--1542.
[70]
Yirong Zhou, Jun Li, Hao Chen, Ye Wu, Jiangjiang Wu, and Luo Chen. 2020. A spatiotemporal attention mechanism-based model for multi-step citywide passenger demand prediction. Inf. Sci. 513 (2020), 372--385.
[71]
Zhengyi Zhou and David S. Matteson. 2015. Predicting Ambulance Demand: a Spatio-Temporal Kernel Approach. In SIGKDD. 2297--2303.
[72]
Rui Zhu, Bin Wang, Xiaochun Yang, Baihua Zheng, and Guoren Wang. 2017. SAP: Improving Continuous Top-K Queries Over Streaming Data. IEEE Trans. Knowl. Data Eng. 29, 6 (2017), 1310--1328.

Cited By

View all
  • (2024)LION: Fast and High-Resolution Network Kernel Density VisualizationProceedings of the VLDB Endowment10.14778/3648160.364816817:6(1255-1268)Online publication date: 3-May-2024
  • (2023)Scalable Evaluation of Local K-Function for Radius-Accurate Hotspot Detection in Spatial NetworksProceedings of the 31st ACM International Conference on Advances in Geographic Information Systems10.1145/3589132.3625646(1-12)Online publication date: 13-Nov-2023
  • (2023)PyNKDV: An Efficient Network Kernel Density Visualization Library for Geospatial Analytic SystemsCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589711(99-102)Online publication date: 4-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 15, Issue 4
December 2021
246 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 December 2021
Published in PVLDB Volume 15, Issue 4

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)37
  • Downloads (Last 6 weeks)2
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)LION: Fast and High-Resolution Network Kernel Density VisualizationProceedings of the VLDB Endowment10.14778/3648160.364816817:6(1255-1268)Online publication date: 3-May-2024
  • (2023)Scalable Evaluation of Local K-Function for Radius-Accurate Hotspot Detection in Spatial NetworksProceedings of the 31st ACM International Conference on Advances in Geographic Information Systems10.1145/3589132.3625646(1-12)Online publication date: 13-Nov-2023
  • (2023)PyNKDV: An Efficient Network Kernel Density Visualization Library for Geospatial Analytic SystemsCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589711(99-102)Online publication date: 4-Jun-2023
  • (2023)Large-scale Geospatial Analytics: Problems, Challenges, and OpportunitiesCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589401(21-29)Online publication date: 4-Jun-2023
  • (2022)LIBKDVProceedings of the VLDB Endowment10.14778/3554821.355485515:12(3606-3609)Online publication date: 29-Sep-2022
  • (2022)Fast network k-function-based spatial analysisProceedings of the VLDB Endowment10.14778/3551793.355183615:11(2853-2866)Online publication date: 1-Jul-2022
  • (2022)SLAM: Efficient Sweep Line Algorithms for Kernel Density VisualizationProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3517823(2120-2134)Online publication date: 10-Jun-2022

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media