Abstract
In many emerging applications, the data to be monitored is of very high volume, dynamic, and distributed, making it infeasible to collect the distinct data streams to a central node and process them there. Often, the monitoring problem consists of determining whether the value of a global function, which depends on the union of all streams, crossed a certain threshold. A great deal of effort is directed at reducing communication overhead by transforming the monitoring of the global function to the testing of local constraints, checked independently at the nodes. Recently, geometric monitoring (GM) proved to be very useful for constructing such local constraints for general (non-linear, non-monotonic) functions. Alas, in all current variants of geometric monitoring, the constraints at all nodes share an identical structure and are, thus, unsuitable for handling heterogeneous streams, which obey different distributions at the distinct nodes. To remedy this, we propose a general approach for geometric monitoring of heterogeneous streams (HGM), which defines constraints tailored to fit the distinct data distributions at the nodes. While optimally selecting the constraints is an NP-hard problem, we provide a practical solution, which seeks to reduce running time by hierarchically clustering nodes with similar data distributions and then solving more, but simpler, optimization problems. Experiments are provided to support the validity of the proposed approach.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
DCPR (Data Clustering and Pattern Recognition) Toolbox, http://tinyurl.com/nxospq2
The European air quality database, http://tinyurl.com/ct9bh7x
Abadi, D.J., Ahmad, Y., Balazinska, M., Çetintemel, U., Cherniack, M., Hwang, J.-H., Lindner, W., Maskey, A., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., Zdonik, S.B.: The design of the borealis stream processing engine. In: CIDR (2005)
Burdakis, S., Deligiannakis, A.: Detecting outliers in sensor networks using the geometric approach. In: ICDE (2012)
Cormode, G.: Algorithms for continuous distributing monitoring: A survey. In: AlMoDEP (2011)
Deshpande, A., Guestrin, C., Madden, S., Hellerstein, J.M., Hong, W.: Model-driven data acquisition in sensor networks. In: VLDB (2004)
Elad, M., Tal, A., Ar, S.: Content based retrieval of vrml objects: an iterative and interactive approach. In: Proceedings of the Sixth Eurographics Workshop on Multimedia 2001 (2002)
Fogel, E., Halperin, D.: Exact and efficient construction of minkowski sums of convex polyhedra with applications. Computer-Aided Design 39(11) (2007)
Garofalakis, M.N., Keren, D., Samoladas, V.: Sketch-based geometric monitoring of distributed stream queries. PVLDB (2013)
Giatrakos, N., Deligiannakis, A., Garofalakis, M.N., Sharfman, I., Schuster, A.: Prediction-based geometric monitoring over distributed data streams. In: SIGMOD (2012)
Gupta, R., Ramamritham, K., Mohania, M.K.: Ratio threshold queries over distributed data sources. In: ICDE (2010)
Kanagal, B., Deshpande, A.: Online filtering, smoothing and probabilistic modeling of streaming data. In: ICDE (2008)
Keren, D., Cooper, D.B., Subrahmonia, J.: Describing complicated objects by implicit polynomials. IEEE Trans. Pattern Anal. Mach. Intell. 16(1) (1994)
Keren, D., Sharfman, I., Schuster, A., Livne, A.: Shape sensitive geometric monitoring. IEEE Trans. Knowl. Data Eng. 24(8) (2012)
Kogan, J.: Feature selection over distributed data streams through optimization. In: SDM (2012)
Kurpius, M.R., Goldstein, A.H.: Gas-phase chemistry dominates o3 loss to a forest, implying a source of aerosols and hydroxyl radicals to the atmosphere. Geophysical Research Letters 30(7) (2007)
Papapetrou, O., Garofalakis, M.N., Deligiannakis, A.: Sketch-based querying of distributed sliding-window data streams. PVLDB 5(10) (2012)
Sagy, G., Keren, D., Sharfman, I., Schuster, A.: Distributed threshold querying of general functions by a difference of monotonic representation. PVLDB 4(2) (2010)
Serra, J.P.: Image analysis and mathematical morphology. Academic Press, London (1982)
Shah, S., Ramamritham, K.: Handling non-linear polynomial queries over dynamic data. In: ICDE (2008)
Sharfman, I., Schuster, A., Keren, D.: A geometric approach to monitoring threshold functions over distributed data streams. ACM Trans. Database Syst. 32(4) (2007)
Sharfman, I., Schuster, A., Keren, D.: Shape sensitive geometric monitoring. In: PODS (2008)
Tang, M., Li, F., Phillips, J.M., Jestes, J.: Efficient threshold monitoring for distributed probabilistic data. In: ICDE (2012)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Keren, D. et al. (2014). Monitoring Distributed, Heterogeneous Data Streams: The Emergence of Safe Zones. In: Gupta, P., Zaroliagis, C. (eds) Applied Algorithms. ICAA 2014. Lecture Notes in Computer Science, vol 8321. Springer, Cham. https://doi.org/10.1007/978-3-319-04126-1_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-04126-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04125-4
Online ISBN: 978-3-319-04126-1
eBook Packages: Computer ScienceComputer Science (R0)