[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1247480.1247510acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Fast data stream algorithms using associative memories

Published: 11 June 2007 Publication History

Abstract

The primary goal of data stream research is to develop space and time efficient solutions for answering continuous on-line summarization queries. Research efforts over the last decade have resulted in a number of efficient algorithms with varying degrees of space and time complexities. While these techniques are developed in a standard CPU setting, many of their applications such as click-fraud detection and network-traffic summarization typically execute on special networking architectures called Network Processing Units (NPUs). These NPUs interface with special associative memories known as Ternary Content Addressable Memories (TCAMs) to provide gigabit rate forwarding at network routers. In this paper, we describe how the integrated architecture of NPU and TCAMs can be exploited towards achieving the goal of developing high-speed stream summarization solutions. We propose two TCAM-conscious solutions for the frequent elements problem in data streams and present a comprehensive evaluation of these techniques on a state-of-the-art networking platform.

References

[1]
Intel network processing units. http://www.intel.com/, 2006.
[2]
Teja networking systems. http://www.teja.com, 2006.
[3]
D. J. Abadi, D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. The VLDB Journal, 12(2):120--139, 2003.
[4]
N. Bandi, D. Agrawal, and A. E. Abbadi. Fast algorithms for heavy distinct hitters using associative memories. In International Conference on Distributed Computing Systems(ICDCS), 2007.
[5]
N. Bandi, S. Schnieder, D. Agrawal, and A. E. Abbadi. Hardware acceleration of database operations using content-addressable memories. In DaMoN, 2005.
[6]
S. Chandrasekaran. Telegraphcq: Continuous dataflow processing for an uncertain world, 2003.
[7]
M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. In ICALP'02, pages 693--703, 2002.
[8]
G. Cormode and S. Muthukrishnan. What's hot and what's not: tracking most frequent items dynamically. In PODS '03, pages 296--306, 2003.
[9]
C. Cranor, T. Johnson, O. Spataschek, and V. Shkapenyuk. Gigascope: a stream database for network applications. In SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data, pages 647--651, New York, NY, USA, 2003. ACM Press.
[10]
E. D. Demaine, A. López-Ortiz, and J. I. Munro. Frequency estimation of internet packet streams with limited space. In Proceedings of the 10th Annual European Symposium on Algorithms (ESA'02), volume 2461, pages 348--360, 2002.
[11]
C. Estan and G. Varghese. New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice. ACM Trans. Comput. Syst., 21(3):270--313, 2003.
[12]
R. H. K. Fang Yu. Efficient Multi-Match Packet Classification with TCAM. In IEEE Hot Interconnects, 2004.
[13]
B. T. Gold, A. Ailamaki, L. Huston, and B. Falsafi. Accelerating database operations using a network processor. In DaMoN, 2005.
[14]
M. Greenwald and S. Khanna. Space-efficient online computation of quantile summaries. In SIGMOD Conference, 2001.
[15]
J. Hershberger, N. Shrivastava, S. Suri, and C. D. Tóth. Adaptive spatial partitioning for multidimensional data streams. In ISAAC, pages 522--533, 2004.
[16]
R. M. Karp, S. Shenker, and C. H. Papadimitriou. A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Database Syst., 28(1):51--55, 2003.
[17]
K. Lakshminarayanan, A. Rangarajan, and S. Venkatachary. Algorithms for advanced packet classification with ternary cams. In SIGCOMM '05, pages 193--204, New York, NY, USA, 2005. ACM Press.
[18]
G. S. Manku and R. Motwani. Approximate frequency counts over data streams, 2002.
[19]
G. S. Manku, S. Rajagopalan, and B. G. Lindsay. Approximate medians and other quantiles in one pass and with limited memory. In SIGMOD'98, pages 426--435, 1998.
[20]
A. Metwally, D. Agrawal, and A. E. Abbadi. Efficient computation of frequent and top-k elements in data streams. In ICDT, pages 398--412, 2005.
[21]
J. Misra and D. Gries. Finding repeated elements. Sci. Comput. Program., 2(2):143--152, 1982.
[22]
S. Mysore, B. Agrawal, T. Sherwood, N. Shrivastava, and S. Suri. Profiling over adaptive ranges. In CGO '06: Proceedings of the International Symposium on Code Generation and Optimization, pages 147--158, 2006.
[23]
G. J. Narlikar, A. Basu, and F. Zane. Coolcams: Power-efficient tcams for forwarding engines. In INFOCOM, 2003.
[24]
R. Panigrahy and S. Sharma. Sorting and Searching using Ternary CAMs. In IEEE Micro., 2003.
[25]
J. Widom and R. Motwani. Query processing, resource management, and approximation in a data stream management system, 2003.
[26]
F. Yu, R. H. Katz, and T. V. Lakshman. Gigabit rate packet pattern-matching using tcam. In ICNP '04, pages 174--183, 2004.

Cited By

View all
  • (2024)Learning-Based Sketch for Adaptive and High-Performance Network MeasurementIEEE/ACM Transactions on Networking10.1109/TNET.2024.336417632:3(2571-2585)Online publication date: Jun-2024
  • (2023)Universal and Accurate Sketch for Estimating Heavy Hitters and Moments in Data StreamsIEEE/ACM Transactions on Networking10.1109/TNET.2022.321602531:5(1919-1934)Online publication date: Oct-2023
  • (2023)Enhanced Machine Learning Sketches for Network MeasurementsIEEE Transactions on Computers10.1109/TC.2022.318556072:4(957-970)Online publication date: 1-Apr-2023
  • Show More Cited By

Index Terms

  1. Fast data stream algorithms using associative memories

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
    June 2007
    1210 pages
    ISBN:9781595936868
    DOI:10.1145/1247480
    • General Chairs:
    • Lizhu Zhou,
    • Tok Wang Ling,
    • Program Chair:
    • Beng Chin Ooi
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 June 2007

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. TCAMS
    2. data streams
    3. hardware

    Qualifiers

    • Article

    Conference

    SIGMOD/PODS07
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)12
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 26 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Learning-Based Sketch for Adaptive and High-Performance Network MeasurementIEEE/ACM Transactions on Networking10.1109/TNET.2024.336417632:3(2571-2585)Online publication date: Jun-2024
    • (2023)Universal and Accurate Sketch for Estimating Heavy Hitters and Moments in Data StreamsIEEE/ACM Transactions on Networking10.1109/TNET.2022.321602531:5(1919-1934)Online publication date: Oct-2023
    • (2023)Enhanced Machine Learning Sketches for Network MeasurementsIEEE Transactions on Computers10.1109/TC.2022.318556072:4(957-970)Online publication date: 1-Apr-2023
    • (2022)Concise Retrieval of Flow Statistics for Software-Defined NetworksIEEE Systems Journal10.1109/JSYST.2021.306530616:1(554-565)Online publication date: Mar-2022
    • (2022)TalentSketch: LSTM-based Sketch for Adaptive and High-Precision Network Measurement2022 IEEE 30th International Conference on Network Protocols (ICNP)10.1109/ICNP55882.2022.9940396(1-12)Online publication date: 30-Oct-2022
    • (2022)LotterySampling: A Randomized Algorithm for the Heavy Hitters and Top-k Problems in Data StreamsComputing and Combinatorics10.1007/978-3-031-22105-7_3(24-35)Online publication date: 22-Oct-2022
    • (2021)Efficient local locking for massively multithreaded in-memory hash-based operatorsThe VLDB Journal10.1007/s00778-020-00642-5Online publication date: 11-Feb-2021
    • (2020)Don't Work on Individual Data Plane Algorithms. Put Them Together!Proceedings of the 19th ACM Workshop on Hot Topics in Networks10.1145/3422604.3425932(60-66)Online publication date: 4-Nov-2020
    • (2020)An Adaptive Network Data Collection System in SDNIEEE Transactions on Cognitive Communications and Networking10.1109/TCCN.2019.29561416:2(562-574)Online publication date: Jun-2020
    • (2020)Universal Online Sketch for Tracking Heavy Hitters and Estimating Moments of Data StreamsIEEE INFOCOM 2020 - IEEE Conference on Computer Communications10.1109/INFOCOM41043.2020.9155454(974-983)Online publication date: Jul-2020
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media