FROD: Fast and Robust Distance-Based Outlier Detection with Active-Inliers-Patterns in Data Streams

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2018 (ICANN 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11139)

Abstract

The detection of distance-based outliers in streaming data is critical for modern applications ranging from telecommunications to cybersecurity. However, existing work concentrates mainly on improving response speed, and none of these proposals performs well on streams whose data distribution varies. In this paper, we propose a Fast and Robust Outlier Detection method (FROD for short) that resolves this dilemma and improves both detection performance and processing throughput. Specifically, to adapt to the changing distribution in data streams, we employ the Active-Inliers-Pattern, which dynamically selects the objects reserved for further outlier analysis. Moreover, we propose an effective micro-cluster-based data storage structure to improve detection efficiency, supported by our theoretical analysis of the complexity bounds. We also present a potential background-updating optimization that hides the updating time. Experiments on real-world and synthetic datasets verify our theoretical study and demonstrate that our algorithm is not only faster than state-of-the-art methods but also achieves better detection performance when the outlier rate fluctuates.
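
For readers unfamiliar with the problem setting, the following is a minimal sketch of the naive baseline for distance-based outlier detection over a count-based sliding window. It is not FROD: it uses neither the Active-Inliers-Pattern nor the micro-cluster structure described above. It assumes the common definition under which an object is an outlier if fewer than `k` other objects in the current window lie within distance `radius` of it. The function names and the parameters `window_size`, `radius`, and `k` are illustrative assumptions, not taken from the paper.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def window_outliers(window, radius, k):
    """Return the points in `window` that have fewer than `k` neighbours
    within `radius` (assumed definition; quadratic-time baseline)."""
    points = list(window)
    outliers = []
    for i, p in enumerate(points):
        # Count the other points of the window that fall within the radius.
        neighbours = sum(
            1 for j, q in enumerate(points) if i != j and dist(p, q) <= radius
        )
        if neighbours < k:
            outliers.append(p)
    return outliers


def detect_stream(stream, window_size, radius, k):
    """Yield (newest_point, outliers_in_current_window) for each arrival."""
    window = deque(maxlen=window_size)  # oldest point expires automatically
    for point in stream:
        window.append(point)
        yield point, window_outliers(window, radius, k)
```

Each arrival triggers a full rescan of the window, so the cost grows quadratically with the window size. That cost is the kind of overhead that streaming methods aim to avoid by storing and updating neighbour information incrementally.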

Acknowledgement

The authors would like to thank the anonymous reviewers for their valuable comments. This work was supported by the National Key Research and Development Program (Grant No. 2016YFB1000101), the National Natural Science Foundation of China (Grant No. 61379052), the Natural Science Foundation for Distinguished Young Scholars of Hunan Province (Grant No. 14JJ1026), and the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20124307110015).

Author information

Corresponding author

Correspondence to Yijie Wang.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, Z., Wang, Y., Zhao, G., Cheng, L., Ma, X. (2018). FROD: Fast and Robust Distance-Based Outlier Detection with Active-Inliers-Patterns in Data Streams. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science, vol 11139. Springer, Cham. https://doi.org/10.1007/978-3-030-01418-6_62

  • DOI: https://doi.org/10.1007/978-3-030-01418-6_62

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01417-9

  • Online ISBN: 978-3-030-01418-6

  • eBook Packages: Computer Science, Computer Science (R0)
