Abstract
This paper presents a new mathematical programming model and a solution approach for a special class of graph partitioning problem. The problem studied here is in the context of distributed web search, in which a very large world-wide-web graph is partitioned to improve the efficiency of webpage ranking (known as PageRank). Although graph partitioning problems have been widely studied and there have been several computational algorithms and mathematical programming models in the literature, the graph partitioning problem for PageRank imposes unique constraints on the density balance. This problem is called the min-cut density-balanced partitioning problem. In this paper, we propose a new mathematical programming model and a solution approach to efficiently solve this min-cut density-balanced partitioning problem. As the objective on the minimum cut and the constraint on the density balance are not the direct performance measure of PageRank, we also investigate the performance of the solutions obtained from a MIP solver and our approach on the ranking’s accuracy and the local ranking’s computation times. The experiment results show both solutions are comparable in terms of the ranking’s accuracy and the local ranking’s computation times whereas it is much faster to obtain the partitioning solutions using our approach.
Similar content being viewed by others
References
Andreev, K., Racke, H.: Balanced graph partitioning. Theory Comput. Syst. 39, 929–939 (2006)
Bourse, F., Lelarge, M., Vojnovic, M.: Balanced graph edge partition. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pp. 1456–1465. ACM, New York (2014). https://doi.org/10.1145/2623330.2623660
Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the Seventh International Conference on World Wide Web 7, WWW7, pp. 107–117. Elsevier Science Publishers B. V., Amsterdam (1998). http://dl.acm.org/citation.cfm?id=297805.297827
Datta, D., Figueira, J.R.: Graph partitioning by multi-objective real-valued metaheuristics: a comparative study. Appl. Soft Comput. 11(5), 3976–3987 (2011). https://doi.org/10.1016/j.asoc.2011.01.044
Even, G., Naor, J.S., Rao, S., Schieber, B.: Fast approximate graph partitioning algorithms. In: Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 639–648. Society for Industrial and Applied Mathematics, Philadelphia (1997)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top \(k\) lists. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pp. 28–36 (2003)
Feige, U., Krauthgamer, R.: A polylogarithmic approximation of the minimum bisection. SIAM J. Comput. 31(4), 1090–1118 (2002). https://doi.org/10.1137/S0097539701387660
Figueiredo, R., Moura, G.: Mixed integer programming formulations for clustering problems related to structural balance. Social Netw. 35(4), 639–651 (2013). https://doi.org/10.1016/j.socnet.2013.09.002
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York (1990)
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999). https://doi.org/10.1145/324133.324140
Leskovec, J., Krevl, A.: SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (2014)
Parreira, J.X., Weikum, G.: JXP global authority scores in a P2P network. In: Proceedings of the Eight International Workshop on the Web and Databases (WebDB 2005), pp. 31–36. Baltimore, Maryland (2005)
Salam, B., Salman, F.S., Sayn, S., Terkay, M.: A mixed-integer programming approach to the clustering problem with an application in customer segmentation. Eur. J. Oper. Res. 173(3), 866–879 (2006)
Sangamuang, S., Boonma, P., Natwichai, J.: An efficient algorithm for density-balanced partitioning in distributed pagerank. In: 2014 9th International Conference on Digital Information Management, ICDIM 2014, pp. 118–123 (2014)
Sangamuang, S., Boonma, P., Natwichai, J.: An Algorithm for Min-Cut Density-Balanced Partitioning in P2P Web Ranking, pp. 257–266. Springer, Cham (2015)
Schloegel, K., Karypis, G., Kumar, V.: A New Algorithm for Multi-objective Graph Partitioning, pp. 322–331. Springer, Berlin (1999)
Shi, S., Yu, J., Yang, G., Wang, D.: Distributed page ranking in structured p2p networks. In: Proceedings of the 2003 International Conference on Parallel Processing (2003)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Sangamuang, S., Boonma, P., Natwichai, J. et al. Impact of minimum-cut density-balanced partitioning solutions in distributed webpage ranking. Optim Lett 14, 521–533 (2020). https://doi.org/10.1007/s11590-019-01399-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11590-019-01399-9