Combining Stream Processing Engines and Big Data Storages for Data Analysis

Thomas Steinmaurer, Patrick Traxler, Michael Zwick, Reinhard Stumptner & Christian Lettner

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8502)

Included in the following conference series:

International Symposium on Methodologies for Intelligent Systems

Abstract

We propose a system that combines stream processing engines with big data storage to analyze large volumes of streaming data. It supports online analysis of the data as it arrives and stores the data for later offline analysis. The design emphasizes making data analysis algorithms simple to implement.
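The split between online and offline analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the names `Event`, `RunningMean`, and `Pipeline` are hypothetical, the in-memory list merely stands in for a big data storage, and the running mean stands in for whatever analysis a stream processing engine would run.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """One record from a data stream (hypothetical shape)."""
    sensor: str
    value: float


@dataclass
class RunningMean:
    """Online analysis: per-sensor mean, updated one event at a time."""
    count: Dict[str, int] = field(default_factory=dict)
    mean: Dict[str, float] = field(default_factory=dict)

    def update(self, event: Event) -> float:
        n = self.count.get(event.sensor, 0) + 1
        m = self.mean.get(event.sensor, 0.0)
        m += (event.value - m) / n  # incremental mean, no history needed
        self.count[event.sensor] = n
        self.mean[event.sensor] = m
        return m


class Pipeline:
    """Routes each event to the online analysis and to the store."""

    def __init__(self) -> None:
        self.online = RunningMean()
        self.store: List[Event] = []  # assumption: stand-in for a big data storage

    def ingest(self, event: Event) -> float:
        self.store.append(event)          # keep raw data for offline analysis
        return self.online.update(event)  # immediate online result

    def offline_max(self, sensor: str) -> float:
        """Offline analysis: a full scan over the stored events."""
        return max(e.value for e in self.store if e.sensor == sensor)


if __name__ == "__main__":
    pipeline = Pipeline()
    for value in (1.0, 3.0):
        print("online mean:", pipeline.ingest(Event("a", value)))
    print("offline max:", pipeline.offline_max("a"))
```

In this sketch the analysis code only sees single events (online) or a plain collection of events (offline), which is one way to keep algorithm implementations simple; the actual system may draw the boundary differently.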

Author information

Authors and Affiliations

Software Competence Center Hagenberg, Austria
Thomas Steinmaurer, Patrick Traxler, Michael Zwick, Reinhard Stumptner & Christian Lettner

Editor information

Editors and Affiliations

Research Group PLIS: Programming, Logic and Intelligent Systems Dept. of Communication, Business and Information Technologies, Roskilde University, Denmark
Troels Andreasen & Henning Christiansen
Department of Computer Science and Artificial Intelligence, CITIC, University of Granada, 18071, Granada, Spain
Juan-Carlos Cubero
University of North Carolina, 9201 University City Blvd, Charlotte, NC 28223, USA, and Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland
Zbigniew W. Raś

About this paper

Cite this paper

Steinmaurer, T., Traxler, P., Zwick, M., Stumptner, R., Lettner, C. (2014). Combining Stream Processing Engines and Big Data Storages for Data Analysis. In: Andreasen, T., Christiansen, H., Cubero, JC., Raś, Z.W. (eds) Foundations of Intelligent Systems. ISMIS 2014. Lecture Notes in Computer Science, vol 8502. Springer, Cham. https://doi.org/10.1007/978-3-319-08326-1_48

DOI: https://doi.org/10.1007/978-3-319-08326-1_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08325-4
Online ISBN: 978-3-319-08326-1
eBook Packages: Computer Science, Computer Science (R0)
