Abstract
In several applications, when anomalies are detected, human experts have to investigate or verify them one by one. As they investigate, they unwittingly produce a label for each anomaly: true positive (TP) or false positive (FP). In this paper, we propose a method, called OMD-Clustering, that exploits this label feedback to minimize the FP rate and detect more relevant anomalies, while minimizing the expert effort required to investigate them. The OMD-Clustering method iteratively suggests the top-1 anomalous instance to a human expert and receives feedback. Before suggesting the next anomaly, the method re-ranks the instances so that the top-ranked ones are similar to the TP instances and dissimilar to the FP instances. This is achieved by learning to score anomalies differently in various regions of the feature space. An experimental evaluation on several real-world datasets shows that OMD-Clustering achieves a significant improvement in both detection precision and expert effort compared to state-of-the-art interactive anomaly detection methods.
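The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the OMD-Clustering algorithm itself: the abstract does not give the update rule, so the sketch swaps in a simple additive weight update clipped at zero in place of online mirror descent, and it assumes the base anomaly scores and the region (cluster) assignments are already available. All names (`interactive_loop`, `oracle`, `lr`) and the toy data are hypothetical.

```python
def interactive_loop(scores, clusters, oracle, budget, lr=0.5):
    """Top-1 feedback loop with one score weight per region (illustrative only).

    scores   : base anomaly score of each instance (assumed given)
    clusters : region index of each instance (assumed given)
    oracle   : callable i -> True if the expert confirms instance i (TP)
    budget   : number of instances the expert is willing to investigate
    lr       : step size of the simplified update (hypothetical parameter)
    """
    weights = [1.0] * (max(clusters) + 1)  # one weight per region
    unlabeled = set(range(len(scores)))
    found = []
    for _ in range(budget):
        if not unlabeled:
            break
        # Suggest the single highest-ranked instance under the current weights.
        top = max(unlabeled, key=lambda i: weights[clusters[i]] * scores[i])
        unlabeled.remove(top)
        is_tp = oracle(top)
        found.append((top, is_tp))
        # Simplified stand-in for the learning step: raise the weight of the
        # region on a TP, lower it on an FP, and never let it go negative.
        step = lr * scores[top]
        region = clusters[top]
        weights[region] = max(0.0, weights[region] + (step if is_tp else -step))
    return found, weights


# Toy data: region 0 has the highest base scores but contains only FPs,
# while region 1 holds the true anomalies.
scores = [0.9, 0.85, 0.8, 0.7, 0.65, 0.6]
clusters = [0, 0, 0, 1, 1, 1]
true_anomalies = {3, 4, 5}

found, weights = interactive_loop(
    scores, clusters, oracle=lambda i: i in true_anomalies, budget=4
)
print(found)
print(weights)
```

On this toy data, a single false positive from region 0 is enough to push the remaining region-0 instances below those of region 1, so the expert's next suggestions all come from the region that contains the true anomalies.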
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Cheng, L., Sundaresh, S., Bouguelia, M.-R., Dikmen, O. (2020). Interactive Anomaly Detection Based on Clustering and Online Mirror Descent. In: Gama, J., et al. IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning. ITEM 2020, IoT Streams 2020. Communications in Computer and Information Science, vol 1325. Springer, Cham. https://doi.org/10.1007/978-3-030-66770-2_13
DOI: https://doi.org/10.1007/978-3-030-66770-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66769-6
Online ISBN: 978-3-030-66770-2
eBook Packages: Computer Science, Computer Science (R0)