[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Super Spreader Identification Using Geometric-Min Filter

Published: 31 August 2021 Publication History

Abstract

Super spreader identification has a lot of applications in network management and security monitoring. It is a more difficult problem than heavy hitter identification because flow spread is harder to measure than flow size due to the requirement of duplicate removal. The prior work either incurs heavy memory overhead or requires heavy computations. This paper designs a new super-spreader monitor capable of identifying all flows whose spreads are greater than a user-specified threshold with a probability that can be arbitrarily set. It introduces a generalized geometric hash function, a generalized geometric counter, and a novel geometric-min filter that blocks out the vast majority of small/medium flows from being tracked, allowing us to focus on a small number of flows in which super spreaders are identified. We provide an analytical way of properly setting the system threshold to meet probabilistically guaranteed identification of super spreaders, and implement it on both hardware (FPGA) and software platforms. We perform extensive experiments based on real Internet traffic traces from CAIDA. The results show that with proper parameter settings, the new monitor can identify more than 99% super spreaders with a low memory requirement, better than the prior art.

References

[1]
S. Chen and Y. Tang, “Slowing down internet worms,” in Proc. 24th Int. Conf. Distrib. Comput. Syst., Mar. 2004, pp. 312–319.
[2]
J. Gonget al., “HeavyKeeper: An accurate algorithm for finding top-k elephant flows,” in Proc. USENIX Annu. Tech. Conf. (USENIX ATC). Boston, MA, USA: USENIX Association, 2018, pp. 909–921. [Online]. Available: https://www.usenix.org/conference/atc18/presentation/gong
[3]
R. Ben-Basat, G. Einziger, R. Friedman, and Y. Kassner, “Heavy hitters in streams and sliding windows,” in Proc. 35th Annu. IEEE Int. Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2016, pp. 1–9.
[4]
A. Metwally, D. Agrawal, and A. E. Abbadi, “Efficient computation of frequent and top-k elements in data streams,” in Proc. 10th Int. Conf. Database Theory (ICDT), Jan. 2005, pp. 398–412.
[5]
G. Manku and R. Motwani, “Approximate frequency counts over data streams,” in Proc. VLDB, Aug. 2002, pp. 346–357.
[6]
E. Demaine, A. Lopez-Ortiz, and J. Munro, “Frequency estimation of internet packet streams with limited space,” in Proc. 10th Annu. Eur. Symp. Algorithms (ESA), Sep. 2002, pp. 348–360.
[7]
Z. Liu, A. Manousis, G. Vorsanger, V. Sekar, and V. Braverman, “One sketch to rule them all: Rethinking network flow monitoring with UnivMon,” in Proc. ACM SIGCOMM Conf., Aug. 2016, pp. 101–114.
[8]
T. Yanget al., “Elastic sketch: Adaptive and fast network-wide measurements,” in Proc. Conf. ACM Special Interest Group Data Commun., Aug. 2018, pp. 561–575.
[9]
T. Mori, M. Uchida, R. Kawahara, J. Pan, and S. Goto, “Identifying elephant flows through periodically sampled packets,” in Proc. Conf. Internet Meas., Oct. 2004, pp. 115–120.
[10]
N. Kamiyama and T. Mori, “Simple and accurate identification of high-rate flows by packet sampling,” in Proc. IEEE INFOCOM, Apr. 2006, pp. 1–13.
[11]
Y. Yao, S. Xiong, J. Liao, M. Berry, H. Qi, and Q. Cao, “Identifying frequent flows in large datasets through probabilistic bloom filters,” in Proc. 23rd Int. Symp. Qual. Service, Jun. 2015, pp. 279–288.
[12]
A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and F. True, “Deriving traffic demands for operational IP networks: Methodology and experience,” IEEE/ACM Trans. Netw., vol. 9, no. 3, pp. 265–279, Jun. 2001.
[13]
T. Benson, A. Anand, A. Akella, and M. Zhang, “MicroTE: Fine grained traffic engineering for data centers,” in Proc. 7th Conf. Emerg. Netw. EXperiments Technol. (CoNEXT), 2011, pp. 1–12.
[14]
J. Rasleyet al., “Planck: Millisecond-scale monitoring and control for commodity networks,” ACM SIGCOMM Comput. Commun. Rev., vol. 44, no. 4, pp. 407–418, 2015.
[15]
J. Suh, T. T. Kwon, C. Dixon, W. Felter, and J. Carter, “OpenSample: A low-latency, sampling-based measurement platform for commodity SDN,” in Proc. IEEE 34th Int. Conf. Distrib. Comput. Syst., Jun. 2014, pp. 228–237.
[16]
J. A. Copeland, “Flow-based detection of network intrusions,” U.S. Patent 7 185 368, Feb. 27, 2007.
[17]
G. A. Ajaeiya, N. Adalian, I. H. Elhajj, A. Kayssi, and A. Chehab, “Flow-based intrusion detection system for SDN,” in Proc. IEEE Symp. Comput. Commun. (ISCC), Jul. 2017, pp. 787–793.
[18]
R. Chandra, “Network traffic monitoring for search popularity analysis,” U.S. Patent 7 594 011, Sep. 22, 2009.
[19]
S. K. Fayaz, Y. Tobioka, V. Sekar, and M. Bailey, “Bohatei: Flexible and elastic DDos defense,” in Proc. 24th USENIX Secur. Symp. (USENIX Secur.), 2015, pp. 817–832.
[20]
M. Wischenbartet al., “User profile integration made easy: Model-driven extraction and transformation of social network schemas,” in Proc. 21st Int. Conf. Companion World Wide Web (WWW Companion), 2012, pp. 939–948.
[21]
A. Hall, O. Bachmann, R. Büssow, S. Gănceanu, and M. Nunkesser, “Processing a trillion cells per mouse click,” 2012, arXiv:1208.0225. [Online]. Available: http://arxiv.org/abs/1208.0225
[22]
S. Melniket al., “Dremel: Interactive analysis of web-scale datasets,” Proc. VLDB Endowment, vol. 3, nos. 1–2, pp. 330–339, Sep. 2010.
[23]
S. Heule, M. Nunkesser, and A. Hall, “HyperLogLog in practice: Algorithmic engineering of a state of the art cardinality estimation algorithm,” in Proc. 16th Int. Conf. Extending Database Technol. (EDBT), 2013, pp. 683–692.
[24]
P. Lieven and B. Scheuermann, “High-speed per-flow traffic measurement with probabilistic multiplicity counting,” in Proc. IEEE INFOCOM, Mar. 2010, pp. 1–9.
[25]
M. Yoon, T. Li, S. Chen, and J.-K. Peir, “Fit a spread estimator in small memory,” in Proc. 28th Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2009, pp. 504–512.
[26]
Q. Xiao, S. Chen, M. Chen, and Y. Ling, “Hyper-compact virtual estimators for big network data based on register sharing,” in Proc. Int. Conf. Meas. Modeling Comput. Syst. (ACM SIGMETRICS), Jun. 2015, pp. 417–428.
[27]
Q. Xiao, Y. Qiao, M. Zhen, and S. Chen, “Estimating the persistent spreads in high-speed networks,” in Proc. IEEE 22nd Int. Conf. Netw. Protocols, Oct. 2014, pp. 131–142.
[28]
C. Estan, G. Varghese, and M. Fisk, “Bitmap algorithms for counting active flows on high-speed links,” IEEE/ACM Trans. Netw., vol. 14, no. 5, pp. 925–937, Oct. 2006.
[29]
Y. Zhou, Y. Zhang, C. Ma, S. Chen, and O. O. Odegbile, “Generalized sketch families for network traffic measurement,” Proc. ACM Meas. Anal. Comput. Syst., vol. 3, no. 3, pp. 1–34, Dec. 2019. 10.1145/3366699.
[30]
M. Yu, L. Jose, and R. Miao, “Software defined traffic measurement with OpenSketch,” in Proc. Symp. Netw. Syst. Design Implement. (USENIX), 2013, pp. 29–42.
[31]
H. Wang, C. Ma, O. O. Odegbile, S. Chen, and J.-K. Peir, “Randomized error removal for online spread estimation in data streaming,” Proc. VLDB Endowment, vol. 14, no. 6, pp. 1040–1052, Feb. 2021.
[32]
C. Ma, H. Wang, O. Odegbile, and S. Chen, “Virtual filter for non-duplicate sampling,” in Proc. IEEE Int. Conf. Netw. Protocols (ICNP), Nov. 2021, pp. 1–11.
[33]
S. Venkataraman, D. Song, P. B. Gibbons, and A. Blum, “New streaming algorithms for fast detection of superspreaders,” in Proc. NDSS, 2005, pp. 1–18.
[34]
S. Ganguly, M. Garofalakis, R. Rastogi, and K. Sabnani, “Streaming algorithms for robust, real-time detection of DDoS attacks,” in Proc. 27th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jun. 2007, p. 4.
[35]
R. B. Basat, X. Chen, G. Einziger, S. L. Feibish, D. Raz, and M. Yu, “Routing oblivious measurement analytics,” in Proc. IFIP Netw. Conf. (Netw.), 2020, pp. 449–457.
[36]
G. Cormode and S. Muthukrishnan, “Space efficient mining of multigraph streams,” in Proc. ACM PODS, 2005, pp. 271–282.
[37]
Y. Liu, W. Chen, and Y. Guan, “Identifying high-cardinality hosts from network-wide traffic measurements,” IEEE Trans. Dependable Secure Comput., vol. 13, no. 5, pp. 547–558, Sep. 2016.
[38]
L. Tang, Q. Huang, and P. P. C. Lee, “SpreadSketch: Toward invertible and network-wide detection of superspreaders,” in Proc. IEEE Conf. Comput. Commun. (IEEE INFOCOM), Jul. 2020, pp. 1608–1617.
[39]
CAIDA. Accessed: 2021. [Online]. Available: https://www.caida.org/data
[40]
Q. Zhao, J. Xu, and Z. Liu, “Design of a novel statistics counter architecture with optimal space and time efficiency,” ACM SIGMETRICS Perform. Eval. Rev., vol. 34, no. 1, pp. 323–334, 2006.
[41]
Y. Lu, A. Montanari, B. Prabhakar, S. Dharmapurikar, and A. Kabbani, “Counter braids: A novel counter architecture for per-flow measurement,” in Proc. ACM SIGMETRICS, Jun. 2008, pp. 121–132.
[42]
Y. Li, R. Miao, C. Kim, and M. Yu, “FlowRadar: A better NetFlow for data centers,” in Proc. USENIX NSDI, 2016, pp. 311–324.
[43]
G. Cormode and S. Muthukrishnan, “An improved data stream summary: The count-min sketch and its applications,” in Proc. LATIN, 2004, pp. 58–75.
[44]
Y. Lu and B. Prabhakar, “Robust counting via counter braids: An error-resilient network measurement architecture,” in Proc. 28th Conf. Comput. Commun. (IEEE INFOCOM), Apr. 2009, pp. 522–530.
[45]
Y. Zhou, Y. Zhou, S. Chen, and Y. Zhang, “Highly compact virtual active counters for per-flow traffic measurement,” in Proc. IEEE Conf. Comput. Commun., Apr. 2018, pp. 1–9.
[46]
T. Li, S. Chen, and Y. Ling, “Per-flow traffic measurement through randomized counter sharing,” IEEE/ACM Trans. Netw., vol. 20, no. 5, pp. 1622–1634, Oct. 2012.
[47]
C. Estan and G. Varghese, “New directions in traffic measurement and accounting,” in Proc. 1st ACM SIGCOMM Workshop Internet Meas. (IMW), 2001, pp. 323–336.
[48]
M. Chen and S. Chen, “Counter tree: A scalable counter architecture for per-flow traffic measurement,” in Proc. IEEE 23rd Int. Conf. Netw. Protocols (ICNP), Nov. 2015, pp. 1249–1262.
[50]
C. Ma, H. Wang, O. Odegbile, and S. Chen, “Noise measurement and removal for data streaming algorithms with network applications,” in Proc. IFIP Netw. Conf. (IFIP Netw.), Jun. 2021, pp. 1–9.
[51]
Y. Zhou, Y. Zhou, M. Chen, and S. Chen, “Persistent spread measurement for big network data based on register intersection,” in Proc. ACM SIGMETRICS, Jun. 2017, pp. 1–29.
[52]
K.-Y. Whang, B. T. Vander-Zanden, and H. M. Taylor, “A linear-time probabilistic counting algorithm for database applications,” ACM Trans. Database Syst., vol. 15, no. 2, pp. 208–229, Jun. 1990.
[53]
P. Flajolet and G. N. Martin, “Probabilistic counting algorithms for database applications,” J. Comput. Syst. Sci., vol. 31, pp. 182–209, Sep. 1985.
[54]
M. Durand and P. Flajolet, “LogLog counting of large cardinalities,” in Proc. Eur. Symposia Algorithms (ESA), 2003, pp. 605–617.
[55]
P. Flajolet, E. Fusy, O. Gandouet, and F. Meunier, “HyperLogLog: The analysis of a near-optimal cardinality estimation algorithm,” in Proc. AOFA, 2007, pp. 127–146.
[56]
R. Ben Basat, G. Einziger, R. Friedman, M. C. Luizelli, and E. Waisbard, “Constant time updates in hierarchical heavy hitters,” in Proc. Conf. ACM Special Interest Group Data Commun., Aug. 2017, pp. 127–140.
[57]
Q. Huang, P. P. C. Lee, and Y. Bao, “SketchLearn: Relieving user burdens in approximate measurement with automated statistical inference,” in Proc. SIGCOMM, Aug. 2018, pp. 576–590.
[58]
Z. Liuet al., “NitroSketch: Robust and general sketch-based monitoring in software switches,” in Proc. ACM Special Interest Group Data Commun., Aug. 2019, pp. 334–350.
[59]
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms. Cambridge, MA, USA: MIT Press, 2009.
[60]
G. Einziger and R. Friedman, “Counting with TinyTable: Every bit counts,” IEEE Access, vol. 7, pp. 166292–166309, 2019.
[61]
C. Estan, G. Varghese, and M. Fisk, “Bitmap algorithms for counting active flows on high speed links,” in Proc. ACM SIGCOMM Conf. Internet Meas. (IMC), 2003, pp. 153–166.
[62]
Full Version. Accessed: 2021. [Online]. Available: https://www.dropbox.com/s/lokjzo5w8dea3s8/paper.pdf?dl=0
[63]
[64]
Source Code for Geometric-Min Filter. Accessed: 2021. [Online]. Available: https://github.com/mcynever/GeometricMinFilter

Cited By

View all
  • (2024)Enhancing Accuracy for Super Spreader Identification in High-Speed Data StreamsProceedings of the VLDB Endowment10.14778/3681954.368198817:11(3124-3137)Online publication date: 30-Aug-2024
  • (2023)Online Detection of 1D and 2D Hierarchical Super-Spreaders in High-Speed NetworksProceedings of the 7th Asia-Pacific Workshop on Networking10.1145/3600061.3600080(109-115)Online publication date: 29-Jun-2023
  • (2023)An Adaptive Method for Identifying Super Nodes from Network-wide ViewJournal of Network and Systems Management10.1007/s10922-023-09745-031:3Online publication date: 9-Jun-2023
  • Show More Cited By

Index Terms

  1. Super Spreader Identification Using Geometric-Min Filter
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image IEEE/ACM Transactions on Networking
      IEEE/ACM Transactions on Networking  Volume 30, Issue 1
      Feb. 2022
      473 pages

      Publisher

      IEEE Press

      Publication History

      Published: 31 August 2021
      Published in TON Volume 30, Issue 1

      Qualifiers

      • Research-article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)7
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 13 Dec 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Enhancing Accuracy for Super Spreader Identification in High-Speed Data StreamsProceedings of the VLDB Endowment10.14778/3681954.368198817:11(3124-3137)Online publication date: 30-Aug-2024
      • (2023)Online Detection of 1D and 2D Hierarchical Super-Spreaders in High-Speed NetworksProceedings of the 7th Asia-Pacific Workshop on Networking10.1145/3600061.3600080(109-115)Online publication date: 29-Jun-2023
      • (2023)An Adaptive Method for Identifying Super Nodes from Network-wide ViewJournal of Network and Systems Management10.1007/s10922-023-09745-031:3Online publication date: 9-Jun-2023
      • (2022)An Efficient Method for Detecting Supernodes Using Reversible Summary Data Structures in the Distributed Monitoring SystemsSecurity and Communication Networks10.1155/2022/32714332022Online publication date: 1-Jan-2022
      • (2022)Fast and Accurate Cardinality Estimation by Self-Morphing BitmapsIEEE/ACM Transactions on Networking10.1109/TNET.2022.314720430:4(1674-1688)Online publication date: 10-Feb-2022

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media