Abstract
Graph sampling is crucial for analyzing and understanding large-scale networks across various domains. While numerous approaches have been proposed in the existing literature, a comprehensive evaluation of these methods, with regards to both quality and execution time, is still needed. This paper addresses this gap by offering an exhaustive review of current graph sampling techniques and by introducing three distinct modifications to the Mino-centric graph sampling (MCGS) method. These modified algorithms, along with established methods, are rigorously evaluated through a quantitative analysis that encompasses two comparative iterations and multiple metrics. In addition to the quantitative analysis, we also conduct a qualitative user study, where survey participants assess the quality of the sampling from a visual perspective. Our findings indicate that one of our modified versions of the MCGS algorithm, namely Batch major CC MCGS, not only outperforms other methods in the context of visual evaluation but also significantly optimizes execution time in comparison with the original MCGS algorithm. This improvement equips researchers and practitioners with a powerful tool for exploring large-scale networks in diverse fields.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Ahmed, N., Neville, J., Kompella, R.R.: Network sampling via edge-based node selection with graph induction. Technical report, Purdue University (2011)
Ahmed, N.K., Neville, J., Kompella, R.: Network sampling: from static to streaming graphs. ACM Trans. Knowl. Discov. Data 8(2), 1–56 (2013)
Ahuja, R.K., Magnanti, T.L., Orlin, J.B.: Network flows (1988)
Barabási, A.L.: Network science. Philos. Trans. Roy. Soc. A: Math. Phys. Eng. Sci. 371(1987), 20120375 (2013)
Bholowalia, P., Kumar, A.: EBK-means: a clustering technique based on elbow method and K-means in WSN. Int. J. Comput. Appl. 105(9), 17–24 (2014)
Cosmograph: Interactive network visualization (2023). https://cosmograph.app. Accessed 19 July 2023
Hardiman, S.J., Katzir, L.: Estimating clustering coefficients and size of social networks via random walk. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 539–550 (2013)
Hu, J., et al.: Bc tree-based spectral sampling for big complex network visualization. Appl. Netw. Sci. 6(1), 60 (2021)
Lee, S.H., Kim, P.J., Jeong, H.: Statistical properties of sampled networks. Phys. Rev. E 73(1), 016102 (2006)
Leskovec, J., Huttenlocher, D., Kleinberg, J.: Signed networks in social media. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1361–1370 (2010)
Leskovec, J., Kleinberg, J., Faloutsos, C.: Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 177–187 (2005)
Leskovec, J., Kleinberg, J., Faloutsos, C.: Graph evolution: densification and shrinking diameters. ACM Trans. Knowl. Discov. Data 1(1), 2-es (2007)
Leskovec, J., Rajaraman, A., Ullman, J.D.: Mining of Massive Data Sets. Cambridge University Press, Cambridge (2020)
Leskovec, J., Sosič, R.: SNAP: a general-purpose network analysis and graph-mining library. ACM Trans. Intell. Syst. Technol. 8(1), 1–20 (2016)
McAuley, J., Leskovec, J.: Learning to discover social circles in ego networks. In: Advances in Neural Information Processing Systems, vol. 25, pp. 539–547 (2012)
McCallum, A.K., Nigam, K., Rennie, J., Seymore, K.: The Cora dataset. https://paperswithcode.com/dataset/cora. Accessed 19 July 2023
Newman, M.E.J.: The structure of scientific collaboration networks. Proc. Natl. Acad. Sci. 98(2), 404–409 (2001)
Noble, C.C., Cook, D.J.: Graph-based anomaly detection. In: Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 631–636 (2003)
Rezvanian, A., Rahmati, M., Meybodi, M.R.: Sampling from complex networks using distributed learning automata. Phys. A 396, 224–234 (2014)
Rozemberczki, B., Allen, C., Sarkar, R.: Multi-scale attributed node embedding. J. Complex Netw. 9(2), cnab014 (2021)
Rozemberczki, B., Sarkar, R.: Characteristic functions on graphs: birds of a feather, from statistical descriptors to parametric models. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 1325–1334 (2020)
Stumpf, M.P.H., Wiuf, C.: Sampling properties of random graphs: the degree distribution. Phys. Rev. E 72(3), 036118 (2005)
Uehara, R.: The number of connected components in graphs and its applications. Technical report, Komazawa University (1999)
Wu, Y., Cao, N., Archambault, D., Shen, Q., Qu, H., Cui, W.: Evaluation of graph sampling: a visualization perspective. IEEE Trans. Visual Comput. Graphics 23(1), 401–410 (2016)
Zhao, Y., et al.: Mino-centric graph sampling (MCGS). GitHub repository (2020). https://github.com/csuvis/MCGS. Accessed 19 July 2023
Zhao, Y., et al.: Preserving minority structures in graph sampling. IEEE Trans. Visual Comput. Graphics 27(2), 1698–1708 (2020)
Acknowledgements
We thank the survey participants for their time and contribution to our research.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Khalafyan, G., Tirosyan, I., Yeghiazaryan, V. (2024). Visualization-Driven Graph Sampling Strategy for Exploring Large-Scale Networks. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_22
Download citation
DOI: https://doi.org/10.1007/978-3-031-54534-4_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54533-7
Online ISBN: 978-3-031-54534-4
eBook Packages: Computer ScienceComputer Science (R0)