More Web Proxy on the site http://driver.im/

research-article

Super Spreader Identification Using Geometric-Min Filter

Authors:

Olufemi O. OdegbileAuthors Info & Claims

IEEE/ACM Transactions on Networking, Volume 30, Issue 1

Pages 299 - 312

https://doi.org/10.1109/TNET.2021.3108033

Published: 31 August 2021 Publication History

Abstract

Super spreader identification has a lot of applications in network management and security monitoring. It is a more difficult problem than heavy hitter identification because flow spread is harder to measure than flow size due to the requirement of duplicate removal. The prior work either incurs heavy memory overhead or requires heavy computations. This paper designs a new super-spreader monitor capable of identifying all flows whose spreads are greater than a user-specified threshold with a probability that can be arbitrarily set. It introduces a generalized geometric hash function, a generalized geometric counter, and a novel geometric-min filter that blocks out the vast majority of small/medium flows from being tracked, allowing us to focus on a small number of flows in which super spreaders are identified. We provide an analytical way of properly setting the system threshold to meet probabilistically guaranteed identification of super spreaders, and implement it on both hardware (FPGA) and software platforms. We perform extensive experiments based on real Internet traffic traces from CAIDA. The results show that with proper parameter settings, the new monitor can identify more than 99% super spreaders with a low memory requirement, better than the prior art.

References

[1]

S. Chen and Y. Tang, “Slowing down internet worms,” in Proc. 24th Int. Conf. Distrib. Comput. Syst., Mar. 2004, pp. 312–319.

[2]

J. Gonget al., “HeavyKeeper: An accurate algorithm for finding top-k elephant flows,” in Proc. USENIX Annu. Tech. Conf. (USENIX ATC). Boston, MA, USA: USENIX Association, 2018, pp. 909–921. [Online]. Available: https://www.usenix.org/conference/atc18/presentation/gong

[3]

R. Ben-Basat, G. Einziger, R. Friedman, and Y. Kassner, “Heavy hitters in streams and sliding windows,” in Proc. 35th Annu. IEEE Int. Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2016, pp. 1–9.

[4]

A. Metwally, D. Agrawal, and A. E. Abbadi, “Efficient computation of frequent and top-k elements in data streams,” in Proc. 10th Int. Conf. Database Theory (ICDT), Jan. 2005, pp. 398–412.

[5]

G. Manku and R. Motwani, “Approximate frequency counts over data streams,” in Proc. VLDB, Aug. 2002, pp. 346–357.

[6]

E. Demaine, A. Lopez-Ortiz, and J. Munro, “Frequency estimation of internet packet streams with limited space,” in Proc. 10th Annu. Eur. Symp. Algorithms (ESA), Sep. 2002, pp. 348–360.

[7]

Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman, “One sketch to rule them all: Rethinking network flow monitoring with UnivMon,” in Proc. ACM SIGCOMM Conf., Aug. 2016, pp. 101–114.

[8]

T. Yanget al., “Elastic sketch: Adaptive and fast network-wide measurements,” in Proc. Conf. ACM Special Interest Group Data Commun., Aug. 2018, pp. 561–575.

[9]

T. Mori, M. Uchida, R. Kawahara, J. Pan, and S. Goto, “Identifying elephant flows through periodically sampled packets,” in Proc. Conf. Internet Meas., Oct. 2004, pp. 115–120.

[10]

N. Kamiyama and T. Mori, “Simple and accurate identification of high-rate flows by packet sampling,” in Proc. IEEE INFOCOM, Apr. 2006, pp. 1–13.

[11]

Y. Yao, S. Xiong, J. Liao, M. Berry, H. Qi, and Q. Cao, “Identifying frequent flows in large datasets through probabilistic bloom filters,” in Proc. 23rd Int. Symp. Qual. Service, Jun. 2015, pp. 279–288.

[12]

A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True, “Deriving traffic demands for operational IP networks: Methodology and experience,” IEEE/ACM Trans. Netw., vol. 9, no. 3, pp. 265–279, Jun. 2001.

[13]

T. Benson, A. Anand, A. Akella, and M. Zhang, “MicroTE: Fine grained traffic engineering for data centers,” in Proc. 7th Conf. Emerg. Netw. EXperiments Technol. (CoNEXT), 2011, pp. 1–12.

[14]

J. Rasleyet al., “Planck: Millisecond-scale monitoring and control for commodity networks,” ACM SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 407–418, 2015.

[15]

J. Suh, T. T. Kwon, C. Dixon, W. Felter, and J. Carter, “OpenSample: A low-latency, sampling-based measurement platform for commodity SDN,” in Proc. IEEE 34th Int. Conf. Distrib. Comput. Syst., Jun. 2014, pp. 228–237.

[16]

J. A. Copeland, “Flow-based detection of network intrusions,” U.S. Patent 7 185 368, Feb. 27, 2007.

[17]

G. A. Ajaeiya, N. Adalian, I. H. Elhajj, A. Kayssi, and A. Chehab, “Flow-based intrusion detection system for SDN,” in Proc. IEEE Symp. Comput. Commun. (ISCC), Jul. 2017, pp. 787–793.

[18]

R. Chandra, “Network traffic monitoring for search popularity analysis,” U.S. Patent 7 594 011, Sep. 22, 2009.

[19]

S. K. Fayaz, Y. Tobioka, V. Sekar, and M. Bailey, “Bohatei: Flexible and elastic DDos defense,” in Proc. 24th USENIX Secur. Symp. (USENIX Secur.), 2015, pp. 817–832.

[20]

M. Wischenbartet al., “User profile integration made easy: Model-driven extraction and transformation of social network schemas,” in Proc. 21st Int. Conf. Companion World Wide Web (WWW Companion), 2012, pp. 939–948.

[21]

A. Hall, O. Bachmann, R. Büssow, S. Gănceanu, and M. Nunkesser, “Processing a trillion cells per mouse click,” 2012, arXiv:1208.0225. [Online]. Available: http://arxiv.org/abs/1208.0225

[22]

S. Melniket al., “Dremel: Interactive analysis of web-scale datasets,” Proc. VLDB Endowment, vol. 3, nos. 1–2, pp. 330–339, Sep. 2010.

[23]

S. Heule, M. Nunkesser, and A. Hall, “HyperLogLog in practice: Algorithmic engineering of a state of the art cardinality estimation algorithm,” in Proc. 16th Int. Conf. Extending Database Technol. (EDBT), 2013, pp. 683–692.

[24]

P. Lieven and B. Scheuermann, “High-speed per-flow traffic measurement with probabilistic multiplicity counting,” in Proc. IEEE INFOCOM, Mar. 2010, pp. 1–9.

[25]

M. Yoon, T. Li, S. Chen, and J.-K. Peir, “Fit a spread estimator in small memory,” in Proc. 28th Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2009, pp. 504–512.

[26]

Q. Xiao, S. Chen, M. Chen, and Y. Ling, “Hyper-compact virtual estimators for big network data based on register sharing,” in Proc. Int. Conf. Meas. Modeling Comput. Syst. (ACM SIGMETRICS), Jun. 2015, pp. 417–428.

[27]

Q. Xiao, Y. Qiao, M. Zhen, and S. Chen, “Estimating the persistent spreads in high-speed networks,” in Proc. IEEE 22nd Int. Conf. Netw. Protocols, Oct. 2014, pp. 131–142.

[28]

C. Estan, G. Varghese, and M. Fisk, “Bitmap algorithms for counting active flows on high-speed links,” IEEE/ACM Trans. Netw., vol. 14, no. 5, pp. 925–937, Oct. 2006.

[29]

Y. Zhou, Y. Zhang, C. Ma, S. Chen, and O. O. Odegbile, “Generalized sketch families for network traffic measurement,” Proc. ACM Meas. Anal. Comput. Syst., vol. 3, no. 3, pp. 1–34, Dec. 2019. 10.1145/3366699.

Digital Library

[30]

M. Yu, L. Jose, and R. Miao, “Software defined traffic measurement with OpenSketch,” in Proc. Symp. Netw. Syst. Design Implement. (USENIX), 2013, pp. 29–42.

[31]

H. Wang, C. Ma, O. O. Odegbile, S. Chen, and J.-K. Peir, “Randomized error removal for online spread estimation in data streaming,” Proc. VLDB Endowment, vol. 14, no. 6, pp. 1040–1052, Feb. 2021.

[32]

C. Ma, H. Wang, O. Odegbile, and S. Chen, “Virtual filter for non-duplicate sampling,” in Proc. IEEE Int. Conf. Netw. Protocols (ICNP), Nov. 2021, pp. 1–11.

[33]

S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum, “New streaming algorithms for fast detection of superspreaders,” in Proc. NDSS, 2005, pp. 1–18.

[34]

S. Ganguly, M. Garofalakis, R. Rastogi, and K. Sabnani, “Streaming algorithms for robust, real-time detection of DDoS attacks,” in Proc. 27th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jun. 2007, p. 4.

[35]

R. B. Basat, X. Chen, G. Einziger, S. L. Feibish, D. Raz, and M. Yu, “Routing oblivious measurement analytics,” in Proc. IFIP Netw. Conf. (Netw.), 2020, pp. 449–457.

[36]

G. Cormode and S. Muthukrishnan, “Space efficient mining of multigraph streams,” in Proc. ACM PODS, 2005, pp. 271–282.

[37]

Y. Liu, W. Chen, and Y. Guan, “Identifying high-cardinality hosts from network-wide traffic measurements,” IEEE Trans. Dependable Secure Comput., vol. 13, no. 5, pp. 547–558, Sep. 2016.

[38]

L. Tang, Q. Huang, and P. P. C. Lee, “SpreadSketch: Toward invertible and network-wide detection of superspreaders,” in Proc. IEEE Conf. Comput. Commun. (IEEE INFOCOM), Jul. 2020, pp. 1608–1617.

[39]

CAIDA. Accessed: 2021. [Online]. Available: https://www.caida.org/data

[40]

Q. Zhao, J. Xu, and Z. Liu, “Design of a novel statistics counter architecture with optimal space and time efficiency,” ACM SIGMETRICS Perform. Eval. Rev., vol. 34, no. 1, pp. 323–334, 2006.

[41]

Y. Lu, A. Montanari, B. Prabhakar, S. Dharmapurikar, and A. Kabbani, “Counter braids: A novel counter architecture for per-flow measurement,” in Proc. ACM SIGMETRICS, Jun. 2008, pp. 121–132.

[42]

Y. Li, R. Miao, C. Kim, and M. Yu, “FlowRadar: A better NetFlow for data centers,” in Proc. USENIX NSDI, 2016, pp. 311–324.

[43]

G. Cormode and S. Muthukrishnan, “An improved data stream summary: The count-min sketch and its applications,” in Proc. LATIN, 2004, pp. 58–75.

[44]

Y. Lu and B. Prabhakar, “Robust counting via counter braids: An error-resilient network measurement architecture,” in Proc. 28th Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2009, pp. 522–530.

[45]

Y. Zhou, Y. Zhou, S. Chen, and Y. Zhang, “Highly compact virtual active counters for per-flow traffic measurement,” in Proc. IEEE Conf. Comput. Commun., Apr. 2018, pp. 1–9.

[46]

T. Li, S. Chen, and Y. Ling, “Per-flow traffic measurement through randomized counter sharing,” IEEE/ACM Trans. Netw., vol. 20, no. 5, pp. 1622–1634, Oct. 2012.

[47]

C. Estan and G. Varghese, “New directions in traffic measurement and accounting,” in Proc. 1st ACM SIGCOMM Workshop Internet Meas. (IMW), 2001, pp. 323–336.

[48]

M. Chen and S. Chen, “Counter tree: A scalable counter architecture for per-flow traffic measurement,” in Proc. IEEE 23rd Int. Conf. Netw. Protocols (ICNP), Nov. 2015, pp. 1249–1262.

[49]

Cisco. (2019). Cisco IOS NetFlow. [Online]. Available: http://www.cisco.com/c/en/us/products/ios-nx-os-software/ios-netflow/index.html

[50]

C. Ma, H. Wang, O. Odegbile, and S. Chen, “Noise measurement and removal for data streaming algorithms with network applications,” in Proc. IFIP Netw. Conf. (IFIP Netw.), Jun. 2021, pp. 1–9.

[51]

Y. Zhou, Y. Zhou, M. Chen, and S. Chen, “Persistent spread measurement for big network data based on register intersection,” in Proc. ACM SIGMETRICS, Jun. 2017, pp. 1–29.

[52]

K.-Y. Whang, B. T. Vander-Zanden, and H. M. Taylor, “A linear-time probabilistic counting algorithm for database applications,” ACM Trans. Database Syst., vol. 15, no. 2, pp. 208–229, Jun. 1990.

[53]

P. Flajolet and G. N. Martin, “Probabilistic counting algorithms for database applications,” J. Comput. Syst. Sci., vol. 31, pp. 182–209, Sep. 1985.

[54]

M. Durand and P. Flajolet, “LogLog counting of large cardinalities,” in Proc. Eur. Symposia Algorithms (ESA), 2003, pp. 605–617.

[55]

P. Flajolet, E. Fusy, O. Gandouet, and F. Meunier, “HyperLogLog: The analysis of a near-optimal cardinality estimation algorithm,” in Proc. AOFA, 2007, pp. 127–146.

[56]

R. Ben Basat, G. Einziger, R. Friedman, M. C. Luizelli, and E. Waisbard, “Constant time updates in hierarchical heavy hitters,” in Proc. Conf. ACM Special Interest Group Data Commun., Aug. 2017, pp. 127–140.

[57]

Q. Huang, P. P. C. Lee, and Y. Bao, “SketchLearn: Relieving user burdens in approximate measurement with automated statistical inference,” in Proc. SIGCOMM, Aug. 2018, pp. 576–590.

[58]

Z. Liuet al., “NitroSketch: Robust and general sketch-based monitoring in software switches,” in Proc. ACM Special Interest Group Data Commun., Aug. 2019, pp. 334–350.

[59]

T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. Cambridge, MA, USA: MIT Press, 2009.

Digital Library

[60]

G. Einziger and R. Friedman, “Counting with TinyTable: Every bit counts,” IEEE Access, vol. 7, pp. 166292–166309, 2019.

[61]

C. Estan, G. Varghese, and M. Fisk, “Bitmap algorithms for counting active flows on high speed links,” in Proc. ACM SIGCOMM Conf. Internet Meas. (IMC), 2003, pp. 153–166.

[62]

Full Version. Accessed: 2021. [Online]. Available: https://www.dropbox.com/s/lokjzo5w8dea3s8/paper.pdf?dl=0

[63]

VIVADO. Accessed: 2021. [Online]. Available: https://www.xilinx.com/products/design-tools/vivado.html

[64]

Source Code for Geometric-Min Filter. Accessed: 2021. [Online]. Available: https://github.com/mcynever/GeometricMinFilter

Cited By

Wang H(2024)Enhancing Accuracy for Super Spreader Identification in High-Speed Data StreamsProceedings of the VLDB Endowment10.14778/3681954.368198817:11(3124-3137)Online publication date: 30-Aug-2024
https://doi.org/10.14778/3681954.3681988
Su HXiao Q(2023)Online Detection of 1D and 2D Hierarchical Super-Spreaders in High-Speed NetworksProceedings of the 7th Asia-Pacific Workshop on Networking10.1145/3600061.3600080(109-115)Online publication date: 29-Jun-2023
https://dl.acm.org/doi/10.1145/3600061.3600080
Zhou AQian J(2023)An Adaptive Method for Identifying Super Nodes from Network-wide ViewJournal of Network and Systems Management10.1007/s10922-023-09745-031:3Online publication date: 9-Jun-2023
https://dl.acm.org/doi/10.1007/s10922-023-09745-0
Show More Cited By

Index Terms

Super Spreader Identification Using Geometric-Min Filter
1. Networks
  1. Network services
    1. Network monitoring
2. Security and privacy

Index terms have been assigned to the content through auto-classification.

Recommendations

Enhancing Accuracy for Super Spreader Identification in High-Speed Data Streams

This paper addresses the challenge of identifying super spreaders within large, high-speed data streams. In these streams, data is segmented into flows, with each flow's spread defined as the number of distinct items it contains. A super spreader is ...
Identification of multi-spreader users in social networks for viral marketing

Identifying high spreading power nodes is an interesting problem in social networks. Finding super spreader nodes becomes an arduous task when the nodes appear in large numbers, and the number of existing links becomes enormous among them. One of the ...
Fake spreader is narcissist; Real spreader is Machiavellian prediction of fake news diffusion using psycho-sociological facets
Abstract
Fake news reference to misleading or false articles that is been spread on social network. The diffusion of misleading information on social network causes an impact on civil life of the people in terms of psychological, financial etc.,...
Highlights
- Fake spreaders can be identified by DL model using physiological features.
- Dark ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Networking

IEEE/ACM Transactions on Networking Volume 30, Issue 1

Feb. 2022

473 pages

ISSN:1063-6692

Issue’s Table of Contents

1558-2566 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press

Publication History

Published: 31 August 2021

Published in TON Volume 30, Issue 1

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

4
Total Citations
View Citations
43
Total Downloads

Downloads (Last 12 months)7
Downloads (Last 6 weeks)2

Reflects downloads up to 13 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Wang H(2024)Enhancing Accuracy for Super Spreader Identification in High-Speed Data StreamsProceedings of the VLDB Endowment10.14778/3681954.368198817:11(3124-3137)Online publication date: 30-Aug-2024
https://doi.org/10.14778/3681954.3681988
Su HXiao Q(2023)Online Detection of 1D and 2D Hierarchical Super-Spreaders in High-Speed NetworksProceedings of the 7th Asia-Pacific Workshop on Networking10.1145/3600061.3600080(109-115)Online publication date: 29-Jun-2023
https://dl.acm.org/doi/10.1145/3600061.3600080
Zhou AQian J(2023)An Adaptive Method for Identifying Super Nodes from Network-wide ViewJournal of Network and Systems Management10.1007/s10922-023-09745-031:3Online publication date: 9-Jun-2023
https://dl.acm.org/doi/10.1007/s10922-023-09745-0
Zhou AQian J(2022)An Efficient Method for Detecting Supernodes Using Reversible Summary Data Structures in the Distributed Monitoring SystemsSecurity and Communication Networks10.1155/2022/32714332022Online publication date: 1-Jan-2022
https://dl.acm.org/doi/10.1155/2022/3271433
Wang HMa CChen SWang Y(2022)Fast and Accurate Cardinality Estimation by Self-Morphing BitmapsIEEE/ACM Transactions on Networking10.1109/TNET.2022.314720430:4(1674-1688)Online publication date: 10-Feb-2022
https://dl.acm.org/doi/10.1109/TNET.2022.3147204

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents