Abstract
The proverb “a rotten apple spoils the whole bunch” captures the research problem. More broadly, an anomaly in a graph community can degrade the performance of the whole network, and anomalies obstruct the insight needed for high-quality data analytics. Although machine learning has attracted contributions from many disciplines, graph-based learning is gaining significant importance. Most graph-based anomaly detection (GBAD) techniques inherit the statistical machine learning methodology used for outlier identification. This survey gives a broad overview of state-of-the-art methods for GBAD in a static environment, specifically techniques that use structural orientation and community-based discovery in pursuit of better quality. Some graph summarization techniques are also studied, with the aim of better graph storage and query handling. An intuitive comparative analysis of diverse GBAD algorithms is intended to help novice researchers with problem solving. The survey also points to new research ideas and practical challenges for robust future contributions in the field.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Saranya, S., Rajalakshmi, M. (2022). Certain Strategic Study on Machine Learning-Based Graph Anomaly Detection. In: Shakya, S., Bestak, R., Palanisamy, R., Kamel, K.A. (eds) Mobile Computing and Sustainable Informatics. Lecture Notes on Data Engineering and Communications Technologies, vol 68. Springer, Singapore. https://doi.org/10.1007/978-981-16-1866-6_5
DOI: https://doi.org/10.1007/978-981-16-1866-6_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1865-9
Online ISBN: 978-981-16-1866-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)