Representative Sample Extraction from Web Data Streams

  • Conference paper
Database and Expert Systems Applications (DEXA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Abstract

Smart or digital city infrastructures facilitate both decision support and strategic planning, with applications such as government services, healthcare, transport and traffic management. Generally, each service generates multiple data streams using different data models and structures. Thus, any form of analysis requires an extract-transform-load (ETL) process, of the kind normally associated with data warehousing, to ensure proper cleaning and integration of heterogeneous datasets. In addition, these systems may generate data at a rate that cannot be captured completely using standard computing resources. In this paper, we present an ETL system for transport data coupled with a smart data acquisition methodology to extract a subset of data suitable for analysis.

This research is supported by Science Foundation Ireland (SFI) and the Department of Agriculture, Food and Marine on behalf of the Government of Ireland under Grant Number 16/RC/3835.

Author information

Correspondence to Michael Scriney.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Scriney, M., Xing, C., McCarren, A., Roantree, M. (2019). Representative Sample Extraction from Web Data Streams. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_26

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer Science, Computer Science (R0)
