Abstract
Densest Subgraph Discovery (DSD) is a fundamental and challenging problem in the field of graph mining in recent years. The DSD aims to determine, given a graph G, the subgraph with the maximum density according to the specified definition. The majority of current research focuses on identifying DSD in homogeneous information networks. Nevertheless, these techniques cannot be applied directly in Heterogeneous Information Networks (HINs) since the semantics of paths are not considered. This limits the application of these approaches in certain fields, as many graphs, e.g., the DBLP dataset, comprise of many types of edges. In order to remedy this research need, this paper proposes approaches for resolving the DSD issue over HINs.
By examining numerous linkage paths between two vertices, we characterize the relationship between two objects of the same type using the concept of meta-path. We further build two new HIN models and provide DSD techniques that perform well in large networks. In addition, comprehensive experiments on various HIN datasets demonstrate the effectiveness and efficiency of the proposed methods.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Bahmani, B., Kumar, R., Vassilvitskii, S.: Densest subgraph in streaming and MapReduce. In: PVLDB, pp. 454–46 (2012)
Batagelj, V., Zaversnik, M.: An o(m) algorithm for cores decomposition of networks. arXiv preprint cs/0310049 (2003)
Cantador, I., Brusilovsky, P., Kuflik, T.: 2nd workshop on information heterogeneity and fusion in recommender systems (HetRec 2011). In: Proceedings of the 5th ACM Conference on Recommender Systems, RecSys 2011, New York, NY, USA (2011)
Charikar, M.: Greedy approximation algorithms for finding dense components in a graph. In: Jansen, K., Khuller, S. (eds.) APPROX 2000. LNCS, vol. 1913, pp. 84–95. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44436-X_10
Chen, J., Saad, Y.: Dense subgraph extraction with application to community detection. TKDE 24(7), 1216–1230 (2012)
Cohen, E., Halperin, E., Kaplan, H., Zwick, U.: Reachability and distance queries via 2-hop labels. SIAM J. Comput. 32(5), 1338–1355 (2003)
Dongen, S.M.V.: Graph clustering by flow simulation. Ph.D. thesis, University of Utrecht (2000)
Epasto, A., Lattanzi, S., Sozio, M.: Efficient densest subgraph computation in evolving graphs. In: WWW, pp. 300–310 (2015)
Fang, Y., Yang, Y., Zhang, W., Lin, X., Ca, X.: Effective and efficient community search over large heterogeneous information networks. Proc. VLDB Endow. 13(6), 854–867 (2020)
Fratkin, E., Naughton, B.T., Brutlag, D.L., Batzoglou, S.: MotifCut: regulatory motifs finding with maximum density subgraphs. Bioinformatics 22(14), e150–e157 (2006)
Goldberg, A.V.: Finding a maximum density subgraph. UC Berkeley (1984)
Harper, F.M., Konstan, J.A.: The MovieLens datasets: history and context. ACM Trans. Interact. Intell. Syst. 5, 19:1–19:19 (2015). https://doi.org/10.1145/2827872
He, R., McAuley, J.: Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In: WWW (2016). https://cseweb.ucsd.edu/jmcauley/datasets.html#amazon_reviews
Heineman, G., Pollice, G., Selkow, S.: Network flow algorithms. Algorithms in a Nutshell (2008)
Huang, Z., Zheng, Y., Cheng, R., Sun, Y., Mamoulis, N., Li, X.: Meta structure: computing relevance in large heterogeneous information networks. In: KDD, pp. 1595–1604 (2016)
Schloss Dagstuhl - Leibniz Center for Informatics.: DBLP computer science bibliography (2019). http://dblp.uni-trier.de/xml
Khuller, S., Saha, B.: On finding dense subgraphs. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009. LNCS, vol. 5555, pp. 597–608. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02927-1_50
McAuley, J., Targett, C., Shi, J., van den Hengel, A.: Image-based recommendations on styles and substitutes. In: SIGIR (2015). https://cseweb.ucsd.edu/jmcauley/datasets.html#amazon_reviews
Qin, L., Li, R.H., Chang, L., Zhang, C.: Locally densest subgraph discover. In: KDD, pp. 965–974 (2015)
S. M. of Science Innovation, the Regional Government of Madrid: International Workshop on Information Heterogeneity and Fusion in Recommender Systems (2011). http://www.lastfm.com
Scott, J.: Social Network Analysis: A Handbook. Sage Publications (2000)
Sun, Y., Han, J., Yan, X., Yu, P.S., Wu, T.: PathSim: meta path-based top-k similarity search in heterogeneous information networks. PVLDB 4(11), 992–1003 (2011)
Tang, J., Qu, M., Mei, Q.: Predictive text embedding through large-scale heterogeneous text networks. In: SIGKDD (2015)
Tsourakakis, C., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: KDD, pp. 104–112 (2013)
Yan, Y., Wang, Q., Liu, L.: Latent influence based self-attention framework for heterogeneous network embedding. IEICE Trans. Inf. Syst. 105, 1335–1339 (2022)
Yang, D., Zhang, D., Yu, Z., Yu, Z.: Fine-grained preference-aware location search leveraging crowdsourced digital footprints from LBSNs. In: ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 8–12 (2013)
Zhang, B., Nie, T., Shen, D., Kou, Y., Yu, G., Zhou, Z.: A graph clustering algorithm for citation networks. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds.) APWeb 2016. LNCS, vol. 9932, pp. 414–418. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45817-5_37
Zhang, X., Xie, H., Lui, J.C.S.: Improving bandit learning via heterogeneous information networks: algorithms and applications. ACM Trans. Knowl. Discov. Data 16, 1–25 (2022)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yin, H., Wang, K., Zhang, W., Wen, D., Wang, X., Zhang, Y. (2024). Discovering Densest Subgraph over Heterogeneous Information Networks. In: Bao, Z., Borovica-Gajic, R., Qiu, R., Choudhury, F., Yang, Z. (eds) Databases Theory and Applications. ADC 2023. Lecture Notes in Computer Science, vol 14386. Springer, Cham. https://doi.org/10.1007/978-3-031-47843-7_24
Download citation
DOI: https://doi.org/10.1007/978-3-031-47843-7_24
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47842-0
Online ISBN: 978-3-031-47843-7
eBook Packages: Computer ScienceComputer Science (R0)