n-ary Isolation Forest: An Experimental Comparative Analysis

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2020)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 12416))

Abstract

One of the most challenging problems in modern data mining and for the Computational Intelligence community is anomaly detection in large datasets, particularly those containing mixed data, namely categorical, spatial, or spatio-temporal data. In this study, we discuss various versions of the well-known Isolation Forest method as an efficient tool for finding outliers or anomalies. The versions are based on binary, ternary, and higher-arity search trees. The traditional Isolation Forest is based on binary search trees. We build and investigate n-ary search trees and analyze their efficiency in the context of anomaly detection.
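To make the idea of an n-ary isolation tree concrete, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the n-ary splits are drawn, so this sketch assumes one plausible generalization, in which each node picks a random feature and n-1 random cut points, giving n children. The names `build_tree`, `path_length`, and `mean_path_length`, and all default parameter values, are hypothetical. With `n_children=2` the split rule reduces to the single random cut of a standard isolation tree. The score used here is the raw mean path length (shorter means more anomalous), without the normalization used in the original Isolation Forest papers.

```python
import random
from bisect import bisect_right


def build_tree(points, n_children, depth, max_depth, rng):
    """Recursively partition points (tuples of floats) into n_children buckets.

    Assumption: n_children - 1 cut points are drawn uniformly on one random
    feature. This is an illustrative choice, not taken from the paper.
    """
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}  # leaf: isolated, empty, or depth cap hit
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}  # chosen feature is constant; stop here
    cuts = sorted(rng.uniform(lo, hi) for _ in range(n_children - 1))
    buckets = [[] for _ in range(n_children)]
    for p in points:
        # bisect_right returns an index in 0..n_children-1 for the bucket
        buckets[bisect_right(cuts, p[dim])].append(p)
    children = [
        build_tree(b, n_children, depth + 1, max_depth, rng) for b in buckets
    ]
    return {"dim": dim, "cuts": cuts, "children": children}


def path_length(tree, point, depth=0):
    """Number of edges traversed before `point` reaches a leaf."""
    if "children" not in tree:
        return depth
    idx = bisect_right(tree["cuts"], point[tree["dim"]])
    return path_length(tree["children"][idx], point, depth + 1)


def mean_path_length(data, point, n_children=3, n_trees=100,
                     sample_size=64, max_depth=8, seed=0):
    """Average path length of `point` over a forest of n-ary trees.

    Each tree is built on a random subsample of `data`, as in the
    standard Isolation Forest; shorter averages suggest anomalies.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        tree = build_tree(sample, n_children, 0, max_depth, rng)
        total += path_length(tree, point)
    return total / n_trees


if __name__ == "__main__":
    gen = random.Random(1)
    # Synthetic data: one Gaussian cluster plus a single distant point.
    data = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
    data.append((10.0, 10.0))
    for n in (2, 3, 4):
        far = mean_path_length(data, (10.0, 10.0), n_children=n)
        near = mean_path_length(data, (0.0, 0.0), n_children=n)
        print(f"n={n}: distant point {far:.2f}, central point {near:.2f}")
```

In this sketch, increasing `n_children` makes the trees shallower, so path lengths for different arities are not directly comparable; any comparison across arities would need a suitable normalization, which is left out here.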


Acknowledgements

Funded by the National Science Centre, Poland under CHIST-ERA programme (Grant no. 2018/28/Z/ST6/00563).

Author information

Corresponding author

Correspondence to Paweł Karczmarek.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Karczmarek, P., Kiersztyn, A., Pedrycz, W. (2020). n-ary Isolation Forest: An Experimental Comparative Analysis. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2020. Lecture Notes in Computer Science, vol 12416. Springer, Cham. https://doi.org/10.1007/978-3-030-61534-5_17

  • DOI: https://doi.org/10.1007/978-3-030-61534-5_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61533-8

  • Online ISBN: 978-3-030-61534-5

  • eBook Packages: Computer Science, Computer Science (R0)
