Abstract
Relations among multiple entities are prevalent in many fields, and hypergraphs are widely used to represent such group relations. Hence, machine learning on hypergraphs has received considerable attention, and in particular, much effort has been devoted to neural network architectures for hypergraphs (a.k.a., hypergraph neural networks). However, existing studies have mostly focused on small datasets for a few single-entity-level downstream tasks and overlooked scalability issues, although most real-world group relations are large-scale. In this work, we propose new tasks, datasets, and scalable training methods for addressing these limitations. First, we introduce two pair-level hypergraph-learning tasks to formulate a wide range of real-world problems. Then, we build and publicly release two large-scale hypergraph datasets with tens of millions of nodes, rich features, and labels. After that, we propose PCL, a scalable learning method for hypergraph neural networks. To tackle scalability issues, PCL splits a given hypergraph into partitions and trains a neural network via contrastive learning. Our extensive experiments demonstrate that hypergraph neural networks can be trained on large-scale hypergraphs by PCL while outperforming 16 baseline models. Notably, their performance is comparable to, or surprisingly even better than, that achieved by training hypergraph neural networks on the entire hypergraphs without partitioning.
Code and Data Availability
The source code used in this paper and the large-scale hypergraph datasets that we build are publicly available at https://github.com/kswoo97/pcl for reproducibility.
Notes
For example, in an unsupervised setting without any given positive example pairs, one can additionally split edges in \(\mathcal {E}'\) to use them as positive example pairs for training.
For example, it can be considered in unsupervised settings especially when the clustering membership is strongly correlated with node features and/or topological information.
Available at https://www.aminer.cn/oag-2-1.
Available at https://github.com/UKPLab/sentence-transformers.
Available at https://scholar.google.com/citations?view_op=top_venues&hl=en&vq=eng. In this taxonomy, a single venue is associated with a single sub-field.
PaToH is a balanced partitioning method. It ensures that all generated partitions are of similar sizes (Çatalyürek and Aykanat 2011), specifically satisfying \(\vert \mathcal {P}_{k}^{\mathcal {V}}\vert \le \frac{(1 + \epsilon )}{\vert \mathcal {P}\vert } \sum _{i=1}^{\vert \mathcal {P}\vert } \vert \mathcal {P}_{i}^{\mathcal {V}}\vert , \ \forall k=1,\cdots ,\vert \mathcal {P}\vert\). As shown in Table 8 in Sect. 6.3.5, partitions obtained by PaToH from real-world hypergraphs are well balanced. Specifically, the standard deviation of the number of nodes in each partition is less than 0.5% of the average number of nodes per partition.
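The balance constraint above can be checked directly from the partition sizes. The following is a minimal sketch (not taken from our released code); the partition sizes and the tolerance \(\epsilon\) in the example are illustrative values.

```python
def is_balanced(partition_sizes, eps):
    """Check the PaToH-style balance constraint: every |P_k^V| must not exceed
    (1 + eps) / |P| times the total number of nodes."""
    bound = (1 + eps) / len(partition_sizes) * sum(partition_sizes)
    return all(size <= bound for size in partition_sizes)

# Example: four partitions of nearly equal size satisfy a 3% tolerance.
print(is_balanced([2_500_000, 2_498_000, 2_503_000, 2_499_000], eps=0.03))  # True
```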
One can set K based on the amount of available memory (a smaller K generally results in higher memory consumption). Note that the performance of the proposed method is not significantly affected by K, as demonstrated in Sect. 6.3.5.
Note that other self-supervised losses (e.g., (Addanki et al. 2021)) can be used alternatively.
Since graph encoders require a graph topology as an input, we convert original hypergraphs into graphs by Clique Expansion, described in Appendix A.2.
This is because partitioning algorithms generally assign such nodes in the same partition.
At each mini-batch (partition) of contrastive learning, we record the GPU memory usage after completing the gradient computation (specifically, we execute loss.backward() and check the currently allocated GPU memory using torch.cuda.memory_allocated(device)). Once the encoder has been trained on every mini-batch, we calculate the average GPU memory usage for the current epoch by averaging the usage across all partitions. Finally, we compute the average GPU memory usage across all epochs.
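For concreteness, the measurement loop described in this note might look as follows. This is a hedged sketch: encoder, partitions, optimizer, and loss_fn are hypothetical placeholders rather than objects from our released code.

```python
import torch

def train_and_measure(encoder, partitions, optimizer, loss_fn, device, num_epochs=50):
    """Sketch of the per-partition GPU-memory bookkeeping described above."""
    epoch_avgs = []
    for epoch in range(num_epochs):
        per_partition_usage = []
        for batch in partitions:                 # one mini-batch = one partition
            optimizer.zero_grad()
            loss = loss_fn(encoder, batch.to(device))
            loss.backward()                      # gradients are now materialized
            # record allocated GPU memory right after the backward pass (bytes)
            per_partition_usage.append(torch.cuda.memory_allocated(device))
            optimizer.step()
        epoch_avgs.append(sum(per_partition_usage) / len(per_partition_usage))
    return sum(epoch_avgs) / len(epoch_avgs)     # average over all epochs
```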
The total contrastive training epochs are 50.
References
Addanki R, Battaglia P, Budden D, et al (2021) Large-scale graph representation learning with very deep GNNs and self-supervision. arXiv:2107.09422, https://doi.org/10.48550/arXiv.2107.09422
Ahmed I, Galoppo T, Hu X et al (2021) Graph regularized autoencoder and its application in unsupervised anomaly detection. IEEE Trans Pattern Anal Mach Intell (TPAMI) 44(8):4110–4124. https://doi.org/10.1109/TPAMI.2021.3066111
Arya D, Gupta DK, Rudinac S, et al (2020) HyperSAGE: Generalizing inductive representation learning on hypergraphs. arXiv:2010.04558, https://doi.org/10.48550/arXiv.2010.04558
Bai S, Zhang F, Torr PH (2021) Hypergraph convolution and hypergraph attention. Pattern Recogn 110:107637. https://doi.org/10.1016/j.patcog.2020.107637
Benson AR, Abebe R, Schaub MT et al (2018) Simplicial closure and higher-order link prediction. Proceed Nat Acad Sci 115(48):E11221–E11230. https://doi.org/10.1073/pnas.1800683115
Caldwell AE, Kahng AB, Markov IL (2000) Improved algorithms for hypergraph bipartitioning. In: Proceedings of the 2000 Asia and South pacific design automation conference (ASP-DAC), pp 661–666, https://doi.org/10.1109/ASPDAC.2000.835182
Caron E, van Eck NJ (2014) Large scale author name disambiguation using rule-based scoring and clustering. In: Proceedings of the 19th international conference on science and technology indicators (STI), pp 79–86, https://doi.org/10.1007/978-981-32-9298-7_12
Çatalyürek ÜV, Aykanat C (2011) PaToH (partitioning tool for hypergraphs). In: Encyclopedia of parallel computing. Springer, pp 1479–1487, https://doi.org/10.1007/978-0-387-09766-4_93
Chen J, Ma T, Xiao C (2018a) FastGCN: Fast learning with graph convolutional networks via importance sampling. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1801.10247
Chen J, Zhu J, Song L (2018b) Stochastic training of graph convolutional networks with variance reduction. In: International conference on machine learning (ICML), pp 942–950, https://doi.org/10.48550/arXiv.1710.10568
Chen T, Kornblith S, Norouzi M, et al. (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning (ICML), pp 1597–1607, https://doi.org/10.48550/arXiv.2002.05709
Chiang WL, Liu X, Si S, et al (2019) Cluster-GCN: An efficient algorithm for training deep and large graph convolutional networks. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (KDD), pp 257–266, https://doi.org/10.1145/3292500.3330925
Chien E, Pan C, Peng J, et al (2021) You are AllSet: A multiset function framework for hypergraph neural networks. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.2106.13264
Chodrow PS, Veldt N, Benson AR (2021) Generative hypergraph clustering: from blockmodels to modularity. Sci Adv 28:eabh1303. https://doi.org/10.1126/sciadv.abh1303
Contisciani M, Battiston F, De Bacco C (2022) Inference of hyperedges and overlapping communities in hypergraphs. Nat Commun 13(1):7229. https://doi.org/10.1038/s41467-022-34714-7
Deng K, Xing L, Zheng L et al (2019) A user identification algorithm based on user behavior analysis in social networks. IEEE Access 7:47114–47123. https://doi.org/10.1109/ACCESS.2019.2909089
Do MT, Yoon Se, Hooi B, et al (2020) Structural patterns and generative models of real-world hypergraphs. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (KDD), pp 176–186, https://doi.org/10.1145/3394486.3403060
Dong Y, Sawin W, Bengio Y (2020) HNHN: hypergraph networks with hyperedge neurons. arXiv:2006.12278, https://doi.org/10.48550/arXiv.2006.12278
Feng Y, You H, Zhang Z, et al (2019) Hypergraph neural networks. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), pp 3558–3565, https://doi.org/10.1609/aaai.v33i01.33013558
Fey M, Lenssen JE (2019) Fast graph representation learning with pytorch geometric. arXiv:1903.02428, https://doi.org/10.48550/arXiv.1903.02428
Gao T, Yao X, Chen D (2021) SimCSE: Simple contrastive learning of sentence embeddings. In: Proceedings of the 2021 conference on empirical methods in natural language processing (EMNLP), pp 6894–6910, https://doi.org/10.18653/v1/2021.emnlp-main.552
Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826. https://doi.org/10.1073/pnas.122653799
Grover A, Leskovec J (2016) Node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), pp 855–864, https://doi.org/10.1145/2939672.2939754
Grunig G, Durmus N, Zhang Y et al (2022) Molecular clustering analysis of blood biomarkers in world trade center exposed community members with persistent lower respiratory symptoms. Int J Environ Res Public Health 19(13):8102. https://doi.org/10.3390/ijerph19138102
Guo M, Yi T, Zhu Y, et al (2021) JITuNE: Just-in-time hyperparameter tuning for network embedding algorithms. arXiv:2101.06427, https://doi.org/10.48550/arXiv.2101.06427
Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. In: Advances in neural information processing systems (NeurIPS), https://doi.org/10.48550/arXiv.1706.02216
Hassani K, Khasahmadi AH (2020) Contrastive multi-view representation learning on graphs. In: International conference on machine learning (ICML), pp 4116–4126, https://doi.org/10.48550/arXiv.2006.05582
He K, Fan H, Wu Y, et al (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 9729–9738, https://doi.org/10.48550/arXiv.1911.05722
Hein M, Setzer S, Jost L, et al (2013) The total variation on hypergraphs-learning on hypergraphs revisited. In: Advances in neural information processing systems (NeurIPS), https://doi.org/10.48550/arXiv.1312.5179
Huang J, Yang J (2021) UniGNN: A unified framework for graph and hypergraph neural networks. In: Proceedings of the Thirtieth international joint conference on artificial intelligence (IJCAI), pp 2563–2569, https://doi.org/10.24963/ijcai.2021/353
Huang W, Zhang T, Rong Y, et al (2018) Adaptive sampling towards fast graph representation learning. In: Advances in neural information processing systems (NeurIPS), https://doi.org/10.48550/arXiv.1809.05343
Hwang H, Lee S, Park C, et al (2022) AHP: learning to negative sample for hyperedge prediction. In: Proceedings of the 45th International ACM SIGIR conference on research and development in information retrieval (SIGIR), pp 2237–2242, https://doi.org/10.1145/3477495.3531836
Jecmen S, Yoon M, Conitzer V, et al (2023) A dataset on malicious paper bidding in peer review. In: Proceedings of the ACM web conference 2023 (WWW), pp 3816–3826, https://doi.org/10.1145/3543507.3583424
Karypis G, Aggarwal R, Kumar V, et al (1997) Multilevel hypergraph partitioning: Application in VLSI domain. In: Proceedings of the 34th annual Design Automation Conference (DAC), pp 526–529, https://doi.org/10.1145/266021.266273
Kim S, Choe M, Yoo J, et al (2022) Reciprocity in directed hypergraphs: measures, findings, and generators. In: IEEE International conference on data mining (ICDM), https://doi.org/10.1109/ICDM54844.2022.00122
Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1412.6980
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1609.02907
Klamt S, Haus UU, Theis F (2009) Hypergraphs and cellular networks. PLoS Comput Biol 5(5):e1000385. https://doi.org/10.1371/journal.pcbi.1000385
Ko J, Kook Y, Shin K (2022) Growth patterns and models of real-world hypergraphs. Knowl Inf Syst 64(11):2883–2920. https://doi.org/10.1007/s10115-022-01739-9
Konstantinova EV, Skorobogatov VA (2001) Application of hypergraph theory in chemistry. Discret Math 235(1–3):365–383. https://doi.org/10.1016/S0012-365X(00)00290-9
Koren Y, Bell R, Volinsky C (2009) Matrix factorization techniques for recommender systems. Computer 42(8):30–37. https://doi.org/10.1109/MC.2009.263
Lee D, Shin K (2023) I’m me, we’re us, and i’m us: Tri-directional contrastive learning on hypergraphs. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), https://doi.org/10.48550/arXiv.2206.04739
Lee G, Choe M, Shin K (2021) How do hyperedges overlap in real-world hypergraphs?-patterns, measures, and generators. In: Proceedings of the web conference 2021 (WWW), pp 3396–3407, https://doi.org/10.1145/3442381.3450010
Lee J, Lee Y, Kim J, et al (2019) Set transformer: a framework for attention-based permutation-invariant neural networks. In: International conference on machine learning (ICML), pp 3744–3753, https://doi.org/10.48550/arXiv.1810.00825
Li P, Milenkovic O (2018) Submodular hypergraphs: p-laplacians, cheeger inequalities and spectral clustering. In: International conference on machine learning (ICML), pp 3014–3023, https://doi.org/10.48550/arXiv.1803.03833
Li Z, Huang C, Xia L, et al (2022) Spatial-temporal hypergraph self-supervised learning for crime prediction. In: IEEE 38th international conference on data engineering (ICDE), pp 2984–2996, https://doi.org/10.1109/ICDE53745.2022.00269
Liu Z, Ma Y, Ouyang Y, et al (2021) Contrastive learning for recommender system. arXiv:2101.01317, https://doi.org/10.48550/arXiv.2101.01317
Luo Q, Yu D, Cai Z, et al (2021) Hypercore maintenance in dynamic hypergraphs. In: IEEE 37th international conference on data engineering (ICDE), pp 2051–2056, https://doi.org/10.1109/ICDE51399.2021.00199
Luo X, Ju W, Qu M, et al (2022) DualGraph: Improving semi-supervised graph classification via dual contrastive learning. In: IEEE 38th international conference on data engineering (ICDE), pp 699–712, https://doi.org/10.1109/ICDE53745.2022.00057
Malatras A, Geneiatakis D, Vakalis I (2017) On the efficiency of user identification: a system-based approach. Int J Inf Secur 16(6):653–671. https://doi.org/10.1007/s10207-016-0340-2
Maleki S, Saless D, Wall DP, et al (2022) HyperNetVec: Fast and scalable hierarchical embedding for hypergraphs. In: Network Science (NetSci), Springer, pp 169–183, https://doi.org/10.1007/978-3-030-97240-0_13
Mayer C, Mayer R, Bhowmik S, et al (2018) HYPE: Massive hypergraph partitioning with neighborhood expansion. In: IEEE International conference on big data (Big Data), pp 458–467, https://doi.org/10.1109/BigData.2018.8621968
Milojević S (2013) Accuracy of simple, initials-based methods for author name disambiguation. J Informet 7(4):767–773. https://doi.org/10.1016/j.joi.2013.06.006
Muttakin MN, Hossain MI, Rahman MS (2022) Overlapping community detection using dynamic dilated aggregation in deep residual GCN. arXiv:2210.11174, https://doi.org/10.48550/arXiv.2210.11174
Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814
Oord Avd, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. arXiv:1807.03748, https://doi.org/10.48550/arXiv.1807.03748
Paszke A, Gross S, Massa F, et al (2019) PyTorch: An imperative style, high-performance deep learning library. In: Advances in Neural information processing systems (NeurIPS), https://doi.org/10.48550/arXiv.1912.01703
Pennington J, Socher R, Manning CD (2014) GloVe: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532–1543, https://doi.org/10.3115/v1/D14-1162
Qu C, Tao M, Yuan R (2018) A hypergraph-based blockchain model and application in internet of things-enabled smart homes. Sensors 18(9):2784. https://doi.org/10.3390/s18092784
Robinson J, Chuang CY, Sra S, et al (2021) Contrastive learning with hard negative samples. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.2010.04592
Rossetti G, Cazabet R (2018) Community discovery in dynamic networks: a survey. ACM Comput Surv 51(2):1–37. https://doi.org/10.1145/3172867
Rossi E, Frasca F, Chamberlain B, et al (2020) SIGN: scalable inception graph neural networks. arXiv:2004.11198, https://doi.org/10.48550/arXiv.2004.11198
Rossi R, Ahmed N (2015) The network data repository with interactive graph analytics and visualization. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), https://doi.org/10.1609/aaai.v29i1.9277
Ruan B, Gan J, Wu H, et al (2021) Dynamic structural clustering on graphs. In: Proceedings of the 2021 international conference on management of data (SIGMOD), pp 1491–1503, https://doi.org/10.1145/3448016.3452828
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–536. https://doi.org/10.1038/323533a0
Sanyal DK, Bhowmick PK, Das PP (2021) A review of author name disambiguation techniques for the PubMed bibliographic database. J Inf Sci 47(2):227–254. https://doi.org/10.1177/0165551519888605
Schlag S, Heuer T, Gottesbüren L et al (2023) High-quality hypergraph partitioning. ACM J Exp Algorithmics 27:1–39. https://doi.org/10.1145/3529090
Shchur O, Günnemann S (2019) Overlapping community detection with graph neural networks. arXiv:1909.12201, https://doi.org/10.48550/arXiv.1909.12201
Sinha A, Shen Z, Song Y, et al (2015) An overview of microsoft academic service (MAS) and applications. In: Proceedings of the 24th international conference on world wide web (WWW), pp 243–246, https://doi.org/10.1145/2740908.2742839
Tang J, Zhang J, Yao L, et al (2008) ArnetMiner: Extraction and mining of academic social networks. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), pp 990–998, https://doi.org/10.1145/1401890.1402008
Torres L, Blevins AS, Bassett D et al (2021) The why, how, and when of representations for complex systems. SIAM Rev 63(3):435–485. https://doi.org/10.1137/20M1355896
Tsitsulin A, Palowitch J, Perozzi B, et al (2020) Graph clustering with graph neural networks. arXiv:2006.16904, https://doi.org/10.48550/arXiv.2006.16904
Tynes M, Gao W, Burrill DJ et al (2021) Pairwise difference regression: a machine learning meta-algorithm for improved prediction and uncertainty quantification in chemical search. J Chem Inf Model 61(8):3846–3857. https://doi.org/10.1021/acs.jcim.1c00670
Veličković P, Fedus W, Hamilton WL, et al (2018) Deep graph infomax. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1809.10341
Wang K, Shen Z, Huang C et al (2020) Microsoft academic graph: when experts are not enough. Quant Sci Stud 1(1):396–413. https://doi.org/10.1162/qss_a_00021
Wang Z, Zheng L, Li Y, et al (2019) Linkage based face clustering via graph convolution network. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 1117–1125, https://doi.org/10.1109/CVPR.2019.00121
Wu F, Souza A, Zhang T, et al (2019) Simplifying graph convolutional networks. In: International conference on machine learning (ICML), pp 6861–6871, https://doi.org/10.48550/arXiv.1902.07153
Xia X, Yin H, Yu J, et al (2021) Self-supervised hypergraph convolutional networks for session-based recommendation. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), pp 4503–4511, https://doi.org/10.1609/aaai.v35i5.16578
Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput Surv 45(4):1–35. https://doi.org/10.1145/2501654.2501657
Xie X, Sun F, Liu Z, et al (2022) Contrastive learning for sequential recommendation. In: IEEE 38th International conference on data engineering (ICDE), pp 1259–1273, https://doi.org/10.1109/ICDE53745.2022.00099
Yadati N, Nimishakavi M, Yadav P, et al (2019) HyperGCN: a new method of training graph convolutional networks on hypergraphs. In: Advances in neural information processing systems (NeurIPS), pp 1509–1520, https://doi.org/10.48550/arXiv.1809.02589
Yadati N, Nitin V, Nimishakavi M, et al (2020) NHP: neural hypergraph link prediction. In: Proceedings of the 29th ACM international conference on information & knowledge management (CIKM), pp 1705–1714, https://doi.org/10.1145/3340531.3411870
Yang J, Leskovec J (2011) Patterns of temporal variation in online media. In: Proceedings of the fourth ACM international conference on Web search and data mining (WSDM), pp 177–186, https://doi.org/10.1145/1935826.1935863
Yang J, Leskovec J (2013) Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the sixth ACM international conference on Web search and data mining (WSDM), pp 587–596, https://doi.org/10.1145/2433396.2433471
Yin N, Feng F, Luo Z, et al (2022) Dynamic hypergraph convolutional network. In: IEEE 38th international conference on data engineering (ICDE), pp 1621–1634, https://doi.org/10.1109/ICDE53745.2022.00167
You Y, Chen T, Sui Y, et al (2020) Graph contrastive learning with augmentations. In: Advances in neural information processing systems (NeurIPS), pp 5812–5823, https://doi.org/10.48550/arXiv.2010.13902
Yu J, Yin H, Li J, et al (2021) Self-supervised multi-channel hypergraph convolutional network for social recommendation. In: Proceedings of the web conference 2021 (WWW), pp 413–424, https://doi.org/10.1145/3442381.3449844
Zaheer M, Kottur S, Ravanbhakhsh S, et al (2017) Deep sets. In: Advances in neural information processing systems (NeurIPS), https://doi.org/10.48550/arXiv.1703.06114
Zeng H, Zhou H, Srivastava A, et al (2019) GraphSAINT: Graph sampling based inductive learning method. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1907.04931
Zhang D, Huang X, Liu Z, et al (2020a) AGL: A scalable system for industrial-purpose graph machine learning. Proc VLDB Endow (PVLDB) 13(12): 3125–3137. https://doi.org/10.14778/3415478.3415539
Zhang F, Liu X, Tang J, et al (2019) OAG: Toward linking large-scale heterogeneous entity graphs. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (KDD), pp 2585–2595, https://doi.org/10.1145/3292500.3330785
Zhang J, Gao M, Yu J, et al (2021) Double-scale self-supervised hypergraph learning for group recommendation. In: Proceedings of the 30th ACM international conference on information & knowledge management (CIKM), pp 2557–2567, https://doi.org/10.1145/3459637.3482426
Zhang J, Li F, Xiao X, et al (2022) Hypergraph convolutional networks via equivalency between hypergraphs and undirected graphs. arXiv:2203.16939, https://doi.org/10.48550/arXiv.2203.16939
Zhang R, Zou Y, Ma J (2020b) Hyper-SAGNN: a self-attention based graph neural network for hypergraphs. In: International conference on learning representations (ICLR), https://doi.org/10.48550/arXiv.1911.02613
Zhang S, Tong H (2016) FINAL: Fast attributed network alignment. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data (KDD), pp 1345–1354, https://doi.org/10.1145/2939672.2939766
Zheng C, Chen H, Cheng Y, et al (2022a) ByteGNN: efficient graph neural network training at large scale. Proc VLDB Endow (PVLDB) 15(6):1228–1242. https://doi.org/10.14778/3514061.3514069
Zheng D, Ma C, Wang M, et al (2020) DistDGL: Distributed graph neural network training for billion-scale graphs. In: IEEE/ACM 10th workshop on irregular applications: architectures and algorithms (IA3), pp 36–44, https://doi.org/10.1109/IA351965.2020.00011
Zheng Y, Pan S, Lee VC, et al (2022b) Rethinking and scaling up graph contrastive learning: an extremely efficient approach with group discrimination. In: Advances in neural information processing systems (NeurIPS), pp 10809–10820, https://doi.org/10.48550/arXiv.2206.01535
Zhu Y, Xu Y, Yu F, et al (2020) Deep graph contrastive representation learning. arXiv:2006.04131, https://doi.org/10.48550/arXiv.2006.04131
Zhu Y, Xu Y, Yu F, et al (2021) Graph contrastive learning with adaptive augmentation. In: Proceedings of the web conference 2021 (WWW), pp 2069–2080, https://doi.org/10.1145/3442381.3449802
Funding
This work was supported by Samsung Electronics Co., Ltd., National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2020R1C1C1008296), and Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00157, Robust, Fair, Extensible Data-Centric Continual Learning) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Responsible editor: Charalampos Tsourakakis.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Additional experimental settings
1.1 Details of datasets
DBLP is a co-authorship hypergraph where nodes and hyperedges correspond to publications and authors, respectively. Each publication's class is labeled according to its field of study. Trivago is a hotel-web-search hypergraph where each node represents a hotel and each hyperedge corresponds to a user. If a user (hyperedge) has visited the website of a particular hotel (node), the corresponding node is added to that user's hyperedge. Each hotel's class is labeled based on the country in which it is located. OGBN-MAG is originally a heterogeneous graph that contains comprehensive academic information, including venues, authors, publications, and affiliations. We transform this heterogeneous graph into a hypergraph as described in Sect. 4, and the label of each node (publication) indicates the venue in which the publication appeared.
1.2 Details of graph-based baseline methods
Since graph representation models (Kipf and Welling 2017; Veličković et al. 2018) require ordinary graph structure as an input, we transform original hypergraph datasets into ordinary graph datasets by using clique expansion, where each hyperedge is replaced with a clique in the resulting graph. Formally, the clique expansion is a transformation of a given hyperedge set \(\mathcal {E}\) to a clique expanded edge set \(\mathcal {E}_{G} = \bigcup _{e \in \mathcal {E}} \left( {\begin{array}{c}e\\ 2\end{array}}\right)\).
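A minimal sketch of this transformation in Python, assuming hyperedges are given as collections of node ids (not taken verbatim from our code), is shown below.

```python
from itertools import combinations

def clique_expansion(hyperedges):
    """Replace every hyperedge with all node pairs it contains.
    hyperedges: an iterable of node-id collections (e.g., lists)."""
    edge_set = set()
    for e in hyperedges:
        edge_set.update(tuple(sorted(pair)) for pair in combinations(e, 2))
    return edge_set

# A size-3 hyperedge becomes a triangle: {(0, 1), (0, 2), (1, 2), (2, 3)}.
print(clique_expansion([[0, 1, 2], [2, 3]]))
```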
Specifically, for the full-graph datasets of DBLP and Trivago, we directly obtain \(\mathcal {E}_{G}\) from \(\mathcal {E}\), the entire hyperedge set. For the full-graph dataset of OGBN-MAG, the resulting clique-expanded edge set is too large to be loaded into main memory. To reduce its scale, we additionally employ sampling, as described in Algorithm 2. Specifically, for each hyperedge \(e'\) whose size is greater than k and for each constituent node \(v \in e'\), we uniformly sample k other nodes from \(e'\) (line 7) and create k edges joining v and each of the k sampled nodes. Here, we set \(k=10\) for the OGBN-MAG dataset. We are unable to create full-graph datasets for AMiner and MAG because clique expansion runs out of memory even with a small k (around 3), and thus we cannot perform full-graph experiments on them.
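The following sketch illustrates the size-capped expansion described above; it follows the textual description rather than reproducing Algorithm 2 verbatim, and the function and variable names are ours.

```python
import random
from itertools import combinations

def sampled_clique_expansion(hyperedges, k=10, seed=0):
    """Hyperedges with at most k nodes are fully clique-expanded; larger ones are
    expanded by sampling k partner nodes per member node, as described above."""
    rng = random.Random(seed)
    edges = set()
    for e in hyperedges:
        nodes = list(e)
        if len(nodes) <= k:
            edges.update(tuple(sorted(p)) for p in combinations(nodes, 2))
        else:
            for v in nodes:
                others = [u for u in nodes if u != v]
                for u in rng.sample(others, k):   # k uniformly sampled partners of v
                    edges.add(tuple(sorted((v, u))))
    return edges
```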
For partitioned-graph datasets of DBLP, Trivago, and OGBN-MAG, we apply clique expansion to the hyperedge set in each partition and use the resulting clique-expanded edge set as that of the corresponding partition. For partitioned-graph datasets of AMiner and MAG, due to the scalability issue, we apply the sampling strategy described in Algorithm 2 to each partition \(\mathcal {P}_{i}\) of \(\varvec{\mathcal {P}}\) (i.e., the input is \(\mathcal {P}^{\mathcal {E}}_{i}\) instead of \(\mathcal {E}\)) and treat the resulting edge set as the edge set of the corresponding partition. Here, we set k to 10.
1.3 Details of hyperparameter settings
We now provide detailed hyperparameter settings of representation models and training methods. The number of layers and hidden dimension of all representation models are fixed to 2 and 128, respectively.
For representation models that are trained via supervised learning methods, we train each model for 100 epochs. We tune the learning rate of each model within \(\{0.01, 0.001, 0.0001\}\). Every 10 epochs, we measure the validation AP score and save the model parameters. Then, we designate the checkpoint with the highest validation AP score as the final model parameters.
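The checkpoint-selection scheme can be summarized by the following sketch, where train_one_epoch and eval_val_ap are hypothetical callables standing in for one epoch of supervised training and validation-AP evaluation, respectively.

```python
import copy

def train_with_checkpointing(model, optimizer, train_one_epoch, eval_val_ap,
                             num_epochs=100, eval_every=10):
    """Keep the checkpoint with the highest validation AP, as described above."""
    best_ap, best_state = float("-inf"), None
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(model, optimizer)
        if epoch % eval_every == 0:
            ap = eval_val_ap(model)              # validation AP at this checkpoint
            if ap > best_ap:
                best_ap, best_state = ap, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)            # best checkpoint = final parameters
    return model
```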
For representation models that are trained via all versions of PCL, we tune the number of self-supervised learning epochs within \(\{25, 50\}\), while we use a broader search space, specifically \(\{20, 40, 60, 80, 100\}\), for the other self-supervised learning methods. We tune the learning rate of self-supervised learning within \(\{0.001, 0.0001\}\) for all self-supervised learning methods. In addition, for methods that require augmentation steps, we tune the extent of node feature augmentation \(p_{v}\) within \(\{0.3, 0.4\}\) and the extent of topological augmentation \(p_{e}\) within \(\{0.3, 0.4\}\). Furthermore, for methods that require negative samples for contrastive learning, we tune the number of negative samples N within \(\{1, 2\}\). The temperature parameter \(\tau\) of all self-supervised learning methods and the scalar \(\lambda\) that controls the strength of the inter-partition loss in PCL+PINS are both fixed to 0.5. Lastly, we train the downstream-task classifiers of all self-supervised learning methods with a learning rate of 0.001. We train the classifiers for 100 epochs, and every 10 epochs, we measure the validation AP score and save the classifier parameters. Then, we designate the checkpoint with the highest validation AP score as the final classifier parameters.
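For reference, a generic InfoNCE-style contrastive term (Oord et al. 2018) with temperature \(\tau\) and N negatives per anchor can be written as in the sketch below; it is illustrative only and is not the exact partition-level loss of PCL.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, tau=0.5):
    """anchor, positive: (d,) embeddings; negatives: (N, d) embeddings."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (anchor * positive).sum() / tau            # scaled cosine similarity
    neg_sim = negatives @ anchor / tau                   # (N,) similarities to negatives
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim])  # positive occupies index 0
    return -F.log_softmax(logits, dim=0)[0]              # -log p(positive | anchor)
```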
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Kim, S., Lee, D., Kim, Y. et al. Datasets, tasks, and training methods for large-scale hypergraph learning. Data Min Knowl Disc 37, 2216–2254 (2023). https://doi.org/10.1007/s10618-023-00952-6