
Certain Strategic Study on Machine Learning-Based Graph Anomaly Detection

  • Conference paper
Mobile Computing and Sustainable Informatics

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 68)

Abstract

“A rotten apple spoils the whole bunch” captures the research problem. Taken more broadly, an anomaly in a graph community can degrade the performance of the whole network, and it hinders insight and the quality of data analytics. Although machine learning contributions span many disciplines, graph-based learning is gaining significant importance. Most graph-based anomaly detection (GBAD) techniques inherit the statistical machine learning methodology used for outlier identification. This survey gives a broad overview of state-of-the-art methods for GBAD in a static environment, specifically techniques that use structural orientation and community-based discovery. Some graph summarization techniques are also studied with a view to better graph storage and query handling. An intuitive comparative analysis of diverse GBAD algorithms helps novice researchers in problem solving. Moreover, the survey opens up new research ideas and practical challenges for robust future contributions.



Author information


Corresponding author

Correspondence to S. Saranya.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Saranya, S., Rajalakshmi, M. (2022). Certain Strategic Study on Machine Learning-Based Graph Anomaly Detection. In: Shakya, S., Bestak, R., Palanisamy, R., Kamel, K.A. (eds) Mobile Computing and Sustainable Informatics. Lecture Notes on Data Engineering and Communications Technologies, vol 68. Springer, Singapore. https://doi.org/10.1007/978-981-16-1866-6_5
