Abstract
Smart or digital city infrastructures facilitate both decision support and strategic planning with applications such as government services, healthcare, transport and traffic management. Generally, each service generates multiple data streams using different data models and structures. Thus, any form of analysis requires an extract-transform-load (ETL) process, of the kind normally associated with data warehousing, to ensure proper cleaning and integration of heterogeneous datasets. In addition, these systems may generate data at a rate that cannot be captured completely using standard computing resources. In this paper, we present an ETL system for transport data coupled with a smart data acquisition methodology to extract a subset of data suitable for analysis.
This research is supported by Science Foundation Ireland (SFI) and the Department of Agriculture, Food and Marine on behalf of the Government of Ireland under Grant Number 16/RC/3835.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Scriney, M., Xing, C., McCarren, A., Roantree, M. (2019). Representative Sample Extraction from Web Data Streams. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science(), vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_26
DOI: https://doi.org/10.1007/978-3-030-27615-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27614-0
Online ISBN: 978-3-030-27615-7
eBook Packages: Computer Science (R0)