BiGeo: A Foundational PaaS Framework for Efficient Storage, Visualization, Management, Analysis, Service, and Migration of Geospatial Big Data—A Case Study of Sichuan Province, China
Figure 1. Overall design of BiGeo. BiGeo is a loosely coupled combination of five components that share the same technology stack and can be combined or split at will. Each part of the framework can be used independently or in an extended manner. The solid boxes in the figure show the core content of BiGeo, and the dashed boxes show the relevant supporting technologies or derivatives.
Figure 2. High-level Greenplum database architecture. Greenplum was a successful commercial OLAP product and is now open source; in addition, because it builds on the PostgreSQL project, it has complete support for spatial data features.
Figure 3. Data cluster management. Through the web interface, we can remotely start and stop the data cluster and monitor the status of the data nodes. With shell tools, we can also use the full set of GP tools for cluster maintenance.
Figure 4. Geographic information system (GIS) engine structure. The GIS engine is named imagetile and was derived from the research results of an actual project. By continuously accumulating support for different data sources and correcting the coordinate and symbol systems, a complete GIS kernel was formed.
Figure 5. GIS engine sample program. This example embeds the GIS engine in a desktop application to quickly display the province's water system data stored in the data cluster, allowing roaming, scaling, and query operations. Other data sources, such as MongoDB, shapefiles, and map services, can also be overlaid.
Figure 6. Desktop main interface. The main interface of Desktop includes menus, toolbars, status bars, layer windows, file directory windows, and map views. The current map shows the province's image service superimposed on administrative division vector data, with one administrative district selected to display and analyze its geometric structure.
Figure 7. Selection-set interaction and node editing of cluster data. By integrating various open-source GIS libraries, more GIS functions can be completed on the desktop side on top of the storage and computing capabilities of the spatial data cluster.
Figure 8. Server console. This is the server host. We can set the IP address, port, and resource root directory of the service; start, stop, browse, and test the service; and monitor the registered handlers and their exception information. The actual service processing is completed by another node program, which is an entry shell program.
Figure 9. Bulk migration of data into a data cluster. Through this migration method, historical data in ArcGIS and Oracle can be conveniently imported into BiGeo for analysis, and the analysis results can be exported back to the original software for continued use.
Figure 10. Overview of surface cover data in Sichuan Province. The data provide full coverage. The vector data interpreted from the image data include all types of surface features in the province. The geometric complexity of these features exceeds that of contour and vegetation features.
Figure 11. Comparison of the spatial analysis efficiency of different software. The dataset contains about 2.3 million records (2,292,765 rows of water system data), and the buffer area is 3100 km long and 20 km wide. (a) Buffer analysis (s). The CPU usage of stand-alone (single-threaded) ArcGIS is approximately 35%, and the single-machine (eight parallel tasks) CPU usage is 100%. (b) Buffer clipping (s). ArcGIS uses the toolbox clip tool; the unoptimized single-machine run takes 636.7 s, and the unoptimized single-machine run with eight parallel tasks takes 637.5 s (despite the eight parallel settings, the task is still executed in a single thread). (c) Statistical area with real-time calculation. The stand-alone machine (eight parallel tasks) is in fact single-threaded, with no parallelism. The precalculated area is obtained by first calculating the area of each feature and storing it as an attribute. The statistical area accumulates the area attribute of each feature.
Figure 12. Desktop used to clip, summarize, and query data on the data cluster. (a) Spatial analysis and statistics of distributed massive vector data. Many spatial analysis and statistical functions are available, and these operations can be combined for more sophisticated analysis. (b) Query and management of distributed massive image data. Through the data cluster and file server, the images can be managed and queried in an integrated manner; the image entity files are not put into the database, and the original image directory can be used directly without modification.
Figure 13. The distribution of various functions of the desktop. At present, the distribution of functions is relatively balanced, covering almost all aspects of desktop GIS applications. All features are tailored to data clusters and enable rapid integration of geospatial data in a distributed environment.
Figure 14. The server can be used to quickly publish vector and image data on the data cluster (on-the-fly). (a) Viewing vector data in a data cluster published on the server side in real time at a small scale. The viewed data are cached automatically, so the more often an area is browsed, the faster the service responds. On the first visit, the server needs to request the data entities from the data cluster. (b) Visiting a global entity image of Dazhou City, Sichuan Province, released by the server. Unlike dynamic vector rendering, the efficiency of accessing physical images depends mainly on the I/O speed of the network and the hard disk.
Figure 15. Comparison of response speeds of different map service types. Dynamic vector rendering requires more CPU and memory and takes the longest time. Dynamic image rendering depends mainly on the disk I/O speed. Once data have been accessed, cached data services respond much more efficiently, and files can also be cached through the client HTTP mechanism, which greatly alleviates the server pressure. Compressed data offer the best overall balance of space occupation, access speed, and cost performance.
Figure 16. Trend interpolation line. As the packet size and the complexity of the features increase, the import speed decreases significantly; after an inflection point, the speed continues to decay slowly.
Abstract
1. Introduction
2. Related Studies
2.1. GIS Software
2.2. Geospatial Database
2.3. Cloud Solution
2.4. Distributed Computing Framework
2.5. Global and Regional Case
3. Design and Architecture
3.1. Users and Main Usage Scenarios
3.2. Functional and Nonfunctional Requirements
3.3. Architectural Choices
3.4. Technological Choices
4. Implementation
4.1. Data Clustering
4.2. GIS Engine
4.3. Desktop
4.4. Server
4.5. Toolkit
5. Case Studies
5.1. Case Study 1: Conducting Complex Spatial SQL Operations on Data Clusters
5.2. Case Study 2: Spatial Analysis and Statistics of Geographic Information Data on the Desktop
5.3. Case Study 3: Publish a Dynamic Map Service Using the Server
5.4. Case Study 4: Quickly Migrating and Converting Large Amounts of Data Using the Toolkit
6. Conclusions and Future Work
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Siddiqa, A.; Karim, A.; Gani, A. Big data storage technologies: A survey. Front. Inf. Technol. Electron. Eng. 2017, 18, 1040–1070.
- Kamilaris, A.; Kartakoullis, A.; Prenafeta-Boldú, F.X. A review on the practice of big data analysis in agriculture. Comput. Electron. Agric. 2017, 143, 23–37.
- Leidig, M.; Teeuw, R. Free software: A review, in the context of disaster management. Int. J. Appl. Earth Obs. Geoinf. 2015, 42, 49–56.
- Jakimavičius, M.; Palevičius, V.; Antuchevičiene, J.; Karpavičius, T. Internet GIS-Based Multimodal Public Transport Trip Planning Information System for Travelers in Lithuania. ISPRS Int. J. Geo-Inf. 2019, 8, 319.
- Huang, Q.; Cervone, G.; Zhang, G. A cloud-enabled automatic disaster analysis system of multi-sourced data streams: An example synthesizing social media, remote sensing and Wikipedia data. Comput. Environ. Urban Syst. 2017, 66, 23–37.
- Sapountzi, A.; Psannis, K.E. Social networking data analysis tools & challenges. Future Gener. Comput. Syst. 2018, 86, 893–913.
- Fritz, S.; McCallum, I.; Schill, C.; Perger, C.; See, L.; Schepaschenko, D.; van der Velde, M.; Kraxner, F.; Obersteiner, M. Geo-Wiki: An online platform for improving global land cover. Environ. Model. Softw. 2012, 31, 110–123.
- Pourebrahim, N.; Sultana, S.; Niakanlahiji, A.; Thill, J.-C. Trip distribution modeling with Twitter data. Comput. Environ. Urban Syst. 2019, 77, 101354.
- Dos Santos, R.F.; Boedihardjo, A.; Shah, S.; Chen, F.; Lu, C.T.; Ramakrishnan, N. The big data of violent events: Algorithms for association analysis using spatio-temporal storytelling. GeoInformatica 2016, 20, 879–921.
- Jiang, B. Geospatial analysis requires a different way of thinking: The problem of spatial heterogeneity. GeoJournal 2015, 80, 1–13.
- Chmielewski, S.; Samulowska, M.; Lupa, M.; Lee, D.; Zagajewski, B. Citizen science and WebGIS for outdoor advertisement visual pollution assessment. Comput. Environ. Urban Syst. 2018, 67, 97–109.
- Repetto, M.P.; Burlando, M.; Solari, G.; De Gaetano, P.; Pizzo, M.; Tizzi, M. A web-based GIS platform for the safe management and risk assessment of complex structural and infrastructural systems exposed to wind. Adv. Eng. Softw. 2018, 117, 29–45.
- Rafoss, T.; Sælid, K.; Sletten, A.; Gyland, L.F.; Engravslia, L. Open geospatial technology standards and their potential in plant pest risk management-GPS-enabled mobile phones utilising open geospatial technology standards Web Feature Service Transactions support the fighting of fire blight in Norway. Comput. Electron. Agric. 2010, 74, 336–340.
- Machwitz, M.; Hass, E.; Junk, J.; Udelhoven, T.; Schlerf, M. CropGIS—A web application for the spatial and temporal visualization of past, present and future crop biomass development. Comput. Electron. Agric. 2019, 161, 185–193.
- Kingdon, A.; Nayembil, M.L.; Richardson, A.E.; Smith, A.G. A geodata warehouse: Using denormalisation techniques as a tool for delivering spatially enabled integrated geological information to geologists. Comput. Geosci. 2016, 96, 87–97.
- Seo, B.C.; Keem, M.; Hammond, R.; Demir, I.; Krajewski, W.F. A pilot infrastructure for searching rainfall metadata and generating rainfall product using the big data of NEXRAD. Environ. Model. Softw. 2019, 117, 69–75.
- Ben Brahim, M.; Drira, W.; Filali, F.; Hamdi, N. Spatial data extension for Cassandra NoSQL database. J. Big Data 2016, 3, 11.
- Kwakkel, J.H.; Carley, S.; Chase, J.; Cunningham, S.W. Visualizing geo-spatial data in science, technology and innovation. Technol. Forecast. Soc. Chang. 2014, 81, 67–81.
- Zhang, X.; Yue, P.; Chen, Y.; Hu, L. An efficient dynamic volume rendering for large-scale meteorological data in a virtual globe. Comput. Geosci. 2019, 126, 1–8.
- Hardebol, N.J.; Bertotti, G. DigiFract: A software and data model implementation for flexible acquisition and processing of fracture data from outcrops. Comput. Geosci. 2013, 54, 326–336.
- Liao, C.; Brown, D.; Fei, D.; Long, X.; Chen, D.; Che, S. Big data-enabled social sensing in spatial analysis: Potentials and pitfalls. Trans. GIS 2018, 22, 1351–1371.
- Yu, J.J.; Qin, X.S.; Larsen, L.C.; Larsen, O.; Jayasooriya, A.; Shen, X.L. A GIS-based management and publication framework for data handling of numerical model results. Adv. Eng. Softw. 2012, 45, 360–369.
- Smith, D.A. Online interactive thematic mapping: Applications and techniques for socio-economic research. Comput. Environ. Urban Syst. 2016, 57, 106–117.
- Giuliani, G.; Nativi, S.; Lehmann, A.; Ray, N. WPS mediation: An approach to process geospatial data on different computing backends. Comput. Geosci. 2012, 47, 20–33.
- Moncrieff, S.; Turdukulov, U.; Gulland, E.K. Integrating geo web services for a user driven exploratory analysis. ISPRS J. Photogramm. Remote Sens. 2016, 114, 294–305.
- Zhao, L.; Liu, Z.; Mbachu, J. Highway alignment optimization: An integrated BIM and GIS approach. ISPRS Int. J. Geo-Inf. 2019, 8, 172.
- Huang, W.; Raza, S.A.; Mirzov, O.; Harrie, L. Assessment and Benchmarking of Spatially Enabled RDF Stores for the Next Generation of Spatial Data Infrastructure. ISPRS Int. J. Geo-Inf. 2019, 8, 310.
- Chen, P.; Shi, W. Measuring the spatial relationship information of multi-Layered vector data. ISPRS Int. J. Geo-Inf. 2018, 7, 88.
- Baumann, P. The OGC web coverage processing service (WCPS) standard. Geoinformatica 2010, 14, 447–479.
- Ludwig, B.; Coetzee, S. Implications of security mechanisms and Service Level Agreements (SLAs) of Platform as a Service (PaaS) clouds for geoprocessing services. Appl. Geomat. 2013, 5, 25–32.
- Tang, J.; Matyas, C.J. Arc4nix: A cross-platform geospatial analytical library for cluster and cloud computing. Comput. Geosci. 2018, 111, 159–166.
- Qin, R. Development of a GIS-based integrated framework for coastal seiches monitoring and forecasting: A North Jiangsu shoal case study. Comput. Geosci. 2017, 103, 70–79.
- Li, F.; Gui, Z.; Wu, H.; Gong, J.; Wang, Y.; Tian, S.; Zhang, J. Big enterprise registration data imputation: Supporting spatiotemporal analysis of industries in China. Comput. Environ. Urban Syst. 2018, 70, 9–23.
- Bellini, P.; Nesi, P. Performance assessment of RDF graph databases for smart city services. J. Vis. Lang. Comput. 2018, 45, 24–38.
- Huang, Z.; Chen, Y.; Wan, L.; Peng, X. GeoSpark SQL: An effective framework enabling spatial queries on spark. ISPRS Int. J. Geo-Inf. 2017, 6, 285.
- Qian, C.; Yi, C.; Cheng, C.; Pu, G.; Wei, X.; Zhang, H. Geosot-based spatiotemporal index of massive trajectory data. ISPRS Int. J. Geo-Inf. 2019, 8, 284.
- Jun, S.; Lee, S. Prototype system for geospatial data building-sharing developed by utilizing open source web technology. Spat. Inf. Res. 2017, 25, 725–733.
- Hu, F.; Xu, M.; Yang, J.; Liang, Y.; Cui, K.; Little, M.M.; Lynnes, C.S.; Duffy, D.Q.; Yang, C. Evaluating the Open Source Data Containers for Handling Big Geospatial Raster Data. ISPRS Int. J. Geo-Inf. 2018, 7, 144.
- Višnjevac, N.; Mihajlović, R.; Šoškić, M.; Cvijetinović, Ž.; Bajat, B. Prototype of the 3D cadastral system based on a NoSQL database and a Javascript visualization application. ISPRS Int. J. Geo-Inf. 2019, 8, 227.
- Lyu, L.; Xu, Q.; Lan, C.; Shi, Q.; Lu, W.; Zhou, Y.; Zhao, Y. Sino-inspace: A digital simulation platform for virtual space environments. ISPRS Int. J. Geo-Inf. 2018, 7, 373.
- Zaragozí, B.; Belda, A.; Linares, J.; Martínez-Pérez, J.E.; Navarro, J.T.; Esparza, J. A free and open source programming library for landscape metrics calculations. Environ. Model. Softw. 2012, 31, 131–140.
- Saif, S.; Wazir, S. Performance Analysis of Big Data and Cloud Computing Techniques: A Survey. Procedia Comput. Sci. 2018, 132, 118–127.
- Morsy, M.M.; Goodall, J.L.; O’Neil, G.L.; Sadler, J.M.; Voce, D.; Hassan, G.; Huxley, C. A cloud-based flood warning system for forecasting impacts to transportation infrastructure systems. Environ. Model. Softw. 2018, 107, 231–244.
- Blauth, D.A.; Ducati, J.R. A Web-based system for vineyards management, relating inventory data, vectors and images. Comput. Electron. Agric. 2010, 71, 182–188.
- Bunting, P.; Clewley, D.; Lucas, R.M.; Gillingham, S. The Remote Sensing and GIS Software Library (RSGISLib). Comput. Geosci. 2014, 62, 216–226.
- Appel, M.; Lahn, F.; Buytaert, W.; Pebesma, E. Open and scalable analytics of large Earth observation datasets: From scenes to multidimensional arrays using SciDB and GDAL. ISPRS J. Photogramm. Remote Sens. 2018, 138, 47–56.
- Haynes, D.; Manson, S.; Shook, E. Terra Populus’ architecture for integrated big geospatial services. Trans. GIS 2017, 21, 546–559.
- Meyer, D.; Riechert, M. Open source QGIS toolkit for the Advanced Research WRF modelling system. Environ. Model. Softw. 2019, 112, 166–178.
- Singh, S.K. Evaluating two freely available geocoding tools for geographical inconsistencies and geocoding errors. Open Geospat. Data Softw. Stand. 2017, 2, 11.
- Ballagh, L.M.; Raup, B.H.; Duerr, R.E.; Khalsa, S.J.S.; Helm, C.; Fowler, D.; Gupte, A. Representing scientific data sets in KML: Methods and challenges. Comput. Geosci. 2011, 37, 57–64.
- Saah, D.; Johnson, G.; Ashmall, B.; Tondapu, G.; Tenneson, K.; Patterson, M.; Poortinga, A.; Markert, K.; Quyen, N.H.; San Aung, K.; et al. Collect Earth: An online tool for systematic reference data collection in land cover and use applications. Environ. Model. Softw. 2019, 118, 166–171.
- Li, W.; Wu, S.; Song, M.; Zhou, X. A scalable cyberinfrastructure solution to support big data management and multivariate visualization of time-series sensor observation data. Earth Sci. Inform. 2016, 9, 449–464.
- Jo, J.; Lee, K.W. High-performance geospatial big data processing system based on MapReduce. ISPRS Int. J. Geo-Inf. 2018, 7, 399.
- Patterson, M.T.; Anderson, N.; Bennett, C.; Bruggemann, J.; Grossman, R.L.; Handy, M.; Ly, V.; Mandl, D.J.; Pederson, S.; Pivarski, J.; et al. The Matsu Wheel: A reanalysis framework for Earth satellite imagery in data commons. Int. J. Data Sci. Anal. 2017, 4, 251–264.
- Yu, J.; Zhang, Z.; Sarwat, M. Spatial data management in apache spark: The GeoSpark perspective and beyond. Geoinformatica 2019, 23, 37–78.
- García-García, F.; Corral, A.; Iribarne, L.; Vassilakopoulos, M.; Manolopoulos, Y. Efficient large-scale distance-based join queries in spatialhadoop. Geoinformatica 2018, 22, 171–209.
- Aji, A.; Wang, F.; Vo, H.; Lee, R.; Liu, Q.; Zhang, X.; Saltz, J. Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce. Proc. VLDB Endow. 2013, 6, 1009–1020.
- Alarabi, L.; Mokbel, M.F.; Musleh, M. ST-Hadoop: A MapReduce framework for spatio-temporal data. GeoInformatica 2018, 22, 785–813.
- Huang, W.; Zhang, W.; Zhang, D.; Meng, L. Elastic Spatial Query Processing in OpenStack Cloud Computing Environment for Time-Constraint Data Analysis. ISPRS Int. J. Geo-Inf. 2017, 6, 84.
- Nikitopoulos, P.; Vouros, G.A.; Vlachou, A.; Doulkeridis, C. Parallel and scalable processing of spatio-temporal RDF queries using Spark. GeoInformatica 2019, 1–31.
- Xia, J.; Yang, C.; Li, Q. Building a spatiotemporal index for Earth Observation Big Data. Int. J. Appl. Earth Obs. Geoinf. 2018, 73, 245–252.
- Mazzetti, P.; Roncella, R.; Mihon, D.; Bacu, V.; Lacroix, P.; Guigoz, Y.; Ray, N.; Giuliani, G.; Gorgan, D.; Nativi, S. Integration of data and computing infrastructures for earth science: An image mosaicking use-case. Earth Sci. Inform. 2016, 9, 325–342.
- Teruzzi, A.; Di Cerbo, P.; Cossarini, G.; Pascolo, E.; Salon, S. Parallel implementation of a data assimilation scheme for operational oceanography: The case of the MedBFM model system. Comput. Geosci. 2019, 124, 103–114.
- Zavala-Romero, O.; Ahmed, A.; Chassignet, E.P.; Zavala-Hidalgo, J.; Fernández Eguiarte, A.; Meyer-Baese, A. An open source Java web application to build self-contained web GIS sites. Environ. Model. Softw. 2014, 62, 210–220.
- Criollo, R.; Velasco, V.; Nardi, A.; Manuel de Vries, L.; Riera, C.; Scheiber, L.; Jurado, A.; Brouyère, S.; Pujades, E.; Rossetto, R.; et al. AkvaGIS: An open source tool for water quantity and quality management. Comput. Geosci. 2019, 127, 123–132.
- Rossetto, R.; De Filippis, G.; Borsi, I.; Foglia, L.; Cannata, M.; Criollo, R.; Vázquez-Suñé, E. Integrating free and open source tools and distributed modelling codes in GIS environment for data-based groundwater management. Environ. Model. Softw. 2018, 107, 210–230.
- Lin, W. Volunteered Geographic Information constructions in a contested terrain: A case of OpenStreetMap in China. Geoforum 2018, 89, 73–82.
- Xie, Z.; Ye, X.; Zheng, Z.; Li, D.; Sun, L.; Li, R.; Benya, S. Modeling polycentric urbanization using multisource big geospatial data. Remote Sens. 2019, 11, 310.
- Galić, Z.; Mešković, E.; Osmanović, D. Distributed processing of big mobility data as spatio-temporal data streams. Geoinformatica 2017, 21, 263–291.
- Kulawiak, M.; Dawidowicz, A.; Pacholczyk, M.E. Analysis of server-side and client-side Web-GIS data processing methods on the example of JTS and JSTS using open data from OSM and geoportal. Comput. Geosci. 2019, 129, 26–37.
- Amirian, P.; Alesheikh, A.A.; Bassiri, A. Standards-based, interoperable services for accessing urban services data for the city of Tehran. Comput. Environ. Urban Syst. 2010, 34, 309–321.
- Ma, X. Linked Geoscience Data in practice: Where W3C standards meet domain knowledge, data visualization and OGC standards. Earth Sci. Inform. 2017, 10, 429–441.
- Song, J.; Di, L. Near-real-time OGC catalogue service for geoscience big data. ISPRS Int. J. Geo-Inf. 2017, 6, 337.
- Horsburgh, J.S.; Reeder, S.L. Data visualization and analysis within a Hydrologic Information System: Integrating with the R statistical computing environment. Environ. Model. Softw. 2014, 52, 51–61.
- Ames, D.P.; Horsburgh, J.S.; Cao, Y.; Kadlec, J.; Whiteaker, T.; Valentine, D. HydroDesktop: Web services-based software for hydrologic data discovery, download, visualization, and analysis. Environ. Model. Softw. 2012, 37, 146–156.
- Gao, F.; Yue, P.; Zhang, C.; Wang, M. Coupling components and services for integrated environmental modelling. Environ. Model. Softw. 2019, 118, 14–22.
- Lucas, G.; Lénárt, C.; Solymosi, J. Development and testing of geo-processing models for the automatic generation of remediation plan and navigation data to use in industrial disaster remediation. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. ISPRS Arch. 2015, 40, 195–201.
- Li, R.; Dong, G.; Jiang, J.; Wu, H.; Yang, N.; Chen, W. Self-adaptive load-balancing strategy based on a time series pattern for concurrent user access on Web map service. Comput. Geosci. 2019, 131, 60–69.
- Eirinaki, M.; Dhar, S.; Mathur, S.; Kaley, A.; Patel, A.; Joshi, A.; Shah, D. A building permit system for smart cities: A cloud-based framework. Comput. Environ. Urban Syst. 2018, 70, 175–188.
- Boulekrouche, B.; Jabeur, N.; Alimazighi, Z. Toward integrating grid and cloud-based concepts for an enhanced deployment of spatial data warehouses in cyber-physical system applications. J. Ambient Intell. Humaniz. Comput. 2016, 7, 475–487.
- Bimonte, S.; Boucelma, O.; Machabert, O.; Sellami, S. A new Spatial OLAP approach for the analysis of Volunteered Geographic Information. Comput. Environ. Urban Syst. 2014, 48, 111–123.

Software Name | Main Uses |
---|---|
Data Clustering | Basic data cluster software: a distributed spatial database and high-performance distributed storage and computing cluster built on inexpensive x86 servers. Its function is similar to that of Oracle Exadata with Spatial, Hadoop, and Spark, but with significantly improved efficiency and more spatial features. |
GIS Engine | The kernel component responsible for map organization and visualization; it underlies the other parts and serves as the secondary development interface. Its function is similar to that of MapObjects, SharpMap, and Mapnik, but it can quickly visualize distributed spatial data and also works with traditional spatial data. |
Desktop | Basic desktop GIS software under a C/S framework. Its function is similar to that of ArcMap, QGIS, and uDig [65,66], and distributed spatial data can be managed and analyzed flexibly through the plug-in mechanism. |
Server | Basic software for publishing geographic information services under a B/S framework with separated front and back ends. Its function is similar to that of ArcServer, GeoServer, MapServer, and Nginx [44], but it can quickly publish services for distributed spatial data, exposing the analysis capabilities of distributed spatial data clusters through service interfaces. |
Toolkit | ETL-support toolset for data conversion and migration. Its function is similar to that of FME, GeoKettle, and ModelBuilder, and it can quickly integrate and migrate existing spatial data into the distributed data cluster environment. |

Role | Node | Configuration |
---|---|---|
Data Cluster | Master | OS CentOS 7.3.1611, CPU 8 core, Memory 8 GB, IP 10.51.60.30 |
Data Cluster | Worker × 4 | OS CentOS 7.3.1611; CPU Xeon E5-2660, 8 core, 16 thread; Memory Samsung DDR3 16 GB RECC 1600 × 4 = 64 GB; Hard disk Samsung 850 EVO 500 GB × 3; Network mainboard-integrated gigabit network card; IP 10.51.60.21-24 |
Server | | OS Windows Server 2008 R2, CPU Intel i7-7600 8 core, Memory 16 GB, IP 10.51.60.191 |
Desktop & Toolkit | | OS Windows 7 SP1, CPU Intel i5-2400 4 core, Memory 12 GB, IP 10.51.60.30.150 |
Dev & Debug | | OS Windows 7 SP1, CPU Intel i7-4500 4 core, Memory 8 GB |

Layer Type | Layer Use |
---|---|
DynamicLayer | Base layer to inherit from in order to implement dynamic feature effects |
GdbLayer | Reads file-type GDB data sources directly through the FileGDB API |
ImageLayer | Image layer; reads remote sensing image data sources via GDAL |
LabelLayer | Automatically labels feature attributes based on automatic avoidance algorithms and annotation symbols |
MbtileLayer | Reads SQLite tile datasets that conform to the MBTiles specification |
MongoLayer | MongoDB layer; reads vector data sources stored in MongoDB |
NtsLayer | NTS layer; reads shp file data sources in pure .NET |
PgLayer | Pg layer; reads vector data from PostgreSQL/Greenplum with PostGIS and similar data source systems via the pg driver |
TileLayer | Tile file layer; reads local tile file data sources that conform to ArcGIS rules |
VectorLayer | Vector layer; reads shp, mdb, and osm vector data sources via OGR |
WmtsLayer | Tile service layer; calls tile services that conform to the WMTS specification |
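
The layer table suggests that each kind of data source is handled by a dedicated layer type. A minimal sketch of that idea follows, in Python purely for illustration (the engine itself appears to be .NET based). The function `pick_layer_type` and the mapping rules are hypothetical and are not part of the imagetile API; only the layer type names come from the table. Shapefiles are mapped to VectorLayer here, although the table shows that NtsLayer can read them as well.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical lookup tables: only the layer type names come from the table above.
SCHEME_TO_LAYER = {
    "mongodb": "MongoLayer",
    "postgresql": "PgLayer",
    "postgres": "PgLayer",
}
SUFFIX_TO_LAYER = {
    ".gdb": "GdbLayer",
    ".mbtiles": "MbtileLayer",
    ".tif": "ImageLayer",
    ".img": "ImageLayer",
    ".shp": "VectorLayer",
    ".mdb": "VectorLayer",
    ".osm": "VectorLayer",
}


def pick_layer_type(source: str) -> str:
    """Return the name of a layer type that could serve the given data source."""
    parsed = urlparse(source)
    scheme = parsed.scheme.lower()
    if scheme in SCHEME_TO_LAYER:
        return SCHEME_TO_LAYER[scheme]
    if scheme in ("http", "https"):
        # Assumption for this sketch: every HTTP source is a WMTS tile service.
        return "WmtsLayer"
    suffix = PurePosixPath(parsed.path).suffix.lower()
    if suffix in SUFFIX_TO_LAYER:
        return SUFFIX_TO_LAYER[suffix]
    raise ValueError(f"no layer type known for source: {source}")


if __name__ == "__main__":
    for src in ("postgresql://gp-master/sichuan", "/data/basemap.mbtiles", "/data/rivers.shp"):
        print(src, "->", pick_layer_type(src))
```

DynamicLayer, LabelLayer, and TileLayer are left out because the table does not tie them to a source that can be recognized from a path or URL alone.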

Symbol Type | Symbol Use |
---|---|
SimplePointSymbol | Simple point symbol, implemented as three types of parameter symbols: block, circle, and image |
SimpleLineSymbol | Simple line symbol, implemented as three types of parameter symbols: solid, template, and image lines |
SimpleFillSymbol | Simple area symbol, implemented as three types of parameter symbols: fill color, template pattern, and image pattern |
MultiSymbol | Combination symbol, implemented by stacking various symbols in layers to draw complex symbols |

Type | Geom | geom2 | sr | sr2 | Tolerance | Output |
---|---|---|---|---|---|---|
Area | √ | \ | ○ | \ | \ | Double |
Boundary | √ | \ | ○ | \ | \ | WKT string |
Envelope | √ | \ | ○ | \ | \ | Doubles |
Buffer | √ | \ | ○ | \ | √ | WKT string |
Centroid | √ | \ | ○ | \ | \ | WKT string |
Contains | √ | √ | ○ | ○ | \ | Bool |
ConvexHull | √ | \ | ○ | \ | \ | WKT string |
Crosses | √ | √ | ○ | ○ | \ | Bool |
Difference | √ | √ | ○ | ○ | \ | Bool |
Disjoint | √ | √ | ○ | ○ | \ | Bool |
Distance | √ | √ | ○ | ○ | \ | Double |
Equal | √ | √ | ○ | ○ | \ | Bool |
GML | √ | \ | ○ | \ | \ | GML string |
JSON | √ | \ | ○ | \ | \ | JSON string |
KML | √ | \ | ○ | \ | \ | KML string |
Wkb | √ | \ | ○ | \ | \ | Byte |
Intersect | √ | √ | ○ | ○ | \ | Bool |
Intersection | √ | √ | ○ | ○ | \ | WKT string |
Length | √ | \ | ○ | \ | \ | Double |
Overlaps | √ | √ | ○ | ○ | \ | Bool |
PointOnSurface | √ | √ | ○ | ○ | \ | WKT string |
Simplify | √ | \ | ○ | \ | √ | WKT string |
SymDifference | √ | √ | ○ | ○ | \ | WKT string |
Touches | √ | √ | ○ | ○ | \ | Bool |
TransformTo | √ | ○ | √ | √ | \ | WKT string |
Union | √ | √ | ○ | ○ | \ | WKT string |
Within | √ | √ | ○ | ○ | \ | Bool |
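
The parameter table can be read as a small rule set that a client checks before calling a geometry operation. The sketch below encodes a few rows under an assumed reading of the marks: √ means required (`R`), ○ means optional (`O`), and \ means not used (`-`). That reading, the `RULES` dictionary, and the `validate` helper are illustrative assumptions, not part of BiGeo.

```python
# Parameter order follows the table columns.
PARAMS = ("geom", "geom2", "sr", "sr2", "tolerance")

# A few rows from the table, one character per parameter.
# Assumed meaning: R = required (√), O = optional (○), - = not used (\).
RULES = {
    "Area":         "R-O--",
    "Buffer":       "R-O-R",
    "Contains":     "RROO-",
    "Simplify":     "R-O-R",
    "Intersection": "RROO-",
    "TransformTo":  "RORR-",
}


def validate(operation, **args):
    """Return a list of problems; an empty list means the call matches the table."""
    rule = RULES.get(operation)
    if rule is None:
        return [f"unknown operation: {operation}"]
    problems = [f"unknown parameter: {name}" for name in args if name not in PARAMS]
    for name, mark in zip(PARAMS, rule):
        supplied = args.get(name) is not None
        if mark == "R" and not supplied:
            problems.append(f"missing required parameter: {name}")
        elif mark == "-" and supplied:
            problems.append(f"parameter not used by {operation}: {name}")
    return problems


if __name__ == "__main__":
    print(validate("Buffer", geom="POINT (0 0)", tolerance=10.0))
    print(validate("Buffer", geom="POINT (0 0)"))
```

Under these assumptions, a Buffer call without a tolerance is reported as incomplete, while the optional spatial reference `sr` can be omitted.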

Analysis Type | Time (s) | Return Records | Description |
---|---|---|---|
count | 0.709 | 21,409,520 | |
count by region | 1.375 | 181 | counts the results for each region, grouped by region code |
within | 0.834 | 518,137 | returns only the CC and id fields |
cutting | 29.285 | 521,316 | cuts with a polygon and returns the resulting geometry |
separate areas of all data | 47.115 | 21,409,520 | computes areas with real-time dynamic projection and returns the area of each feature |
sum of all data areas | 44.054 | 1 | includes dynamic projection and area statistics; returns only the total |
count attribute field | 0.784 | 1 | uses the shape_Leng value stored with the original data at data entry; returns only one total |
total area after cutting | 5.861 | 1 | includes cutting, dynamic projection, and area statistics; returns the total area |
count the area with the largest number of features | 1.354 | 1 | the number of records is 373,230 |
find the polygon containing a point | 0.012 | 1 | takes a point coordinate and determines which polygon the point falls in |
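
This section does not show the SQL behind the timings, so the following is only a sketch of what such statements might look like on a PostGIS-enabled Greenplum cluster. The PostGIS functions used (`ST_Within`, `ST_Intersects`, `ST_Intersection`, `ST_Area`, `ST_Transform`, `ST_Contains`, `ST_GeomFromText`, `ST_SetSRID`, `ST_MakePoint`) exist, but the table name `landcover`, the column names, and both SRIDs are assumptions made for illustration.

```python
TABLE = "landcover"   # hypothetical table name
SRID = 4326           # assumed storage SRID (WGS 84)
AREA_SRID = 32648     # assumed projected SRID (UTM zone 48N) for areas in square metres

# Parameters use the DB-API "pyformat" style, e.g. %(wkt)s, as accepted by psycopg2.
QUERIES = {
    "count": f"SELECT count(*) FROM {TABLE}",
    "count_by_region": (
        f"SELECT region_code, count(*) FROM {TABLE} GROUP BY region_code"
    ),
    "within": (
        f"SELECT cc, id FROM {TABLE} "
        f"WHERE ST_Within(geom, ST_GeomFromText(%(wkt)s, {SRID}))"
    ),
    "cutting": (
        f"SELECT id, ST_Intersection(geom, clip.g) AS geom "
        f"FROM {TABLE}, (SELECT ST_GeomFromText(%(wkt)s, {SRID}) AS g) AS clip "
        f"WHERE ST_Intersects(geom, clip.g)"
    ),
    "total_area": (
        f"SELECT sum(ST_Area(ST_Transform(geom, {AREA_SRID}))) FROM {TABLE}"
    ),
    "point_in_polygon": (
        f"SELECT id FROM {TABLE} "
        f"WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%(x)s, %(y)s), {SRID})) "
        f"LIMIT 1"
    ),
}


def run(cursor, name, params=None):
    """Execute one of the sketched queries on a DB-API cursor and fetch all rows."""
    cursor.execute(QUERIES[name], params or {})
    return cursor.fetchall()


if __name__ == "__main__":
    for name, sql in QUERIES.items():
        print(f"-- {name}\n{sql}\n")
```

In this sketch, `total_area` reprojects each geometry before measuring it, which corresponds to the "dynamic projection" mentioned in the table; storing a precomputed area attribute would avoid that per-row cost.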
Complexity | Feature Type | Average Speed | Bulk Packet Size |
---|---|---|---|
very complex features | surface coverage, contour lines, vegetation | 80,000/min | packet 30 MB/1000 |
complex features | roads, water systems, administrative districts | 150,000/min | packet 15 MB/1000 |
general features | houses, ancillary facilities | 300,000/min | packet 5 MB/1000 |
simple features | points of interest, joint maps, grids, metadata | 600,000/min | packet 1 MB/1000 |
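
The average speeds in the table allow a rough estimate of migration time. The sketch below assumes that the speeds are features per minute and that throughput stays constant over the whole import; the helper `estimate_minutes` is hypothetical. Because the import speed is reported to fall as packet size and feature complexity grow, the result should be read as an order-of-magnitude figure only.

```python
# Average import speeds from the table, read here as features per minute (an assumption).
RATES_PER_MIN = {
    "very complex": 80_000,
    "complex": 150_000,
    "general": 300_000,
    "simple": 600_000,
}


def estimate_minutes(complexity: str, feature_count: int) -> float:
    """Estimate import time in minutes, assuming a constant average speed."""
    return feature_count / RATES_PER_MIN[complexity]


if __name__ == "__main__":
    # 2,292,765 water system rows; water systems are listed as complex features.
    minutes = estimate_minutes("complex", 2_292_765)
    print(f"{minutes:.1f} min")  # prints "15.3 min"
```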
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Liu, X.; Hao, L.; Yang, W. BiGeo: A Foundational PaaS Framework for Efficient Storage, Visualization, Management, Analysis, Service, and Migration of Geospatial Big Data—A Case Study of Sichuan Province, China. ISPRS Int. J. Geo-Inf. 2019, 8, 449. https://doi.org/10.3390/ijgi8100449