research-article

Human-machine interactive streaming anomaly detection by online self-adaptive forest

Published: 08 August 2022

Abstract

Anomaly detectors distinguish normal from abnormal data, usually by computing an anomaly score for each instance and ranking instances by that score. A static unsupervised streaming anomaly detector cannot easily adjust its anomaly score calculation dynamically. In real scenarios, anomaly detection often needs to be regulated by human feedback, which helps tune the detector. In this paper, we propose a human-machine interactive streaming anomaly detection method, named ISPForest, which can be adaptively updated online under the guidance of human feedback. In particular, the feedback is used to adjust both the anomaly score calculation and the structure of the detector, with the aim of producing more accurate anomaly scores in the future. Our main contribution is an improved tree-based streaming anomaly detection model that can be updated online in two respects: anomaly score calculation and model structure. Our approach is instantiated for the powerful class of tree-based streaming anomaly detectors, and we conduct experiments on a range of benchmark datasets. The results demonstrate that incorporating feedback can improve the performance of anomaly detectors with little human effort.
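The general idea of letting human feedback adjust anomaly scoring can be illustrated with a small sketch. This is not ISPForest and does not reproduce the paper's update rules, which the abstract does not specify. It assumes a hypothetical ensemble whose members each emit a score in [0, 1], and uses a simple multiplicative weight update as a stand-in. The class name `FeedbackWeightedEnsemble`, the `learning_rate` parameter, and the update formula are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class FeedbackWeightedEnsemble:
    """Hypothetical illustration (not the paper's method): combines
    per-member anomaly scores in [0, 1] with feedback-adjusted weights."""

    n_members: int
    learning_rate: float = 0.5  # assumed step size, not taken from the paper
    weights: List[float] = field(init=False)

    def __post_init__(self) -> None:
        # Start with uniform weights that sum to 1.
        self.weights = [1.0 / self.n_members] * self.n_members

    def score(self, member_scores: Sequence[float]) -> float:
        """Weighted average of the members' anomaly scores."""
        return sum(w * s for w, s in zip(self.weights, member_scores))

    def feedback(self, member_scores: Sequence[float], is_anomaly: bool) -> None:
        """Multiplicative update: members whose score agreed with the
        human label gain weight, the others lose weight."""
        sign = 1.0 if is_anomaly else -1.0
        updated = [
            w * math.exp(sign * self.learning_rate * (s - 0.5))
            for w, s in zip(self.weights, member_scores)
        ]
        total = sum(updated)
        self.weights = [u / total for u in updated]  # renormalise to sum to 1


# Usage: two members disagree about one instance; a human confirms it is anomalous.
ensemble = FeedbackWeightedEnsemble(n_members=2)
instance_scores = [0.9, 0.1]
score_before = ensemble.score(instance_scores)
ensemble.feedback(instance_scores, is_anomaly=True)
score_after = ensemble.score(instance_scores)
```

After the confirming feedback, the member that scored the instance highly carries more weight, so the same instance receives a higher combined score. A structural update to the trees themselves, which the abstract also mentions, is outside the scope of this sketch.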

Published In

Frontiers of Computer Science: Selected Publications from Chinese Universities  Volume 17, Issue 2
Apr 2023
235 pages
ISSN:2095-2228
EISSN:2095-2236
Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 08 August 2022
Accepted: 17 February 2022
Received: 20 May 2021

Author Tags

  1. anomaly detection
  2. human-machine interaction
  3. human feedback
  4. random space tree
  5. ensemble method
