Abstract
To what extent can changes in PageRank’s damping factor affect node ranking? We prove that, at least on some graphs, the top k nodes assume all possible k! orderings as the damping factor varies, even if it varies within an arbitrarily small interval (e.g. [0.84999, 0.85001]). Thus, the rank of a node for a given (finite set of discrete) damping factor(s) provides very little information about the rank of that node as the damping factor varies over a continuous interval.
We bypass this problem introducing lineage analysis and proving that there is a simple condition, with a “natural” interpretation independent of PageRank, that allows one to verify “in one shot” if a node outperforms another simultaneously for all damping factors and all damping variables (informally, time variant damping factors). The novel notions of strong rank and weak rank of a node provide a measure of the fuzziness of the rank of that node, of the objective orderability of a graph’s nodes, and of the quality of results returned by different ranking algorithms based on the random surfer model.
We deploy our analytical tools on a 41M node snapshot of the .it Web domain and on a 0.7M node snapshot of the CiteSeer citation graph. Among other findings, we show that rank is indeed relatively stable in both graphs; that “classic” PageRank (d = 0.85) marginally outperforms Weighted In-degree (d→0), mainly due to its ability to ferret out “niche” items; and that, for both the Web and CiteSeer, the ideal damping factor appears to be 0.8 − 0.9 to obtain those items of high importance to at least one (model of randomly surfing) user, but only 0.5 − 0.6 to obtain those items important to every (model of randomly surfing) user.
This work was supported in part by MIUR under PRIN Mainstream and by EU under Integr. Proj. AEOLUS (IP-FP6-015964).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
CiteSeer metadata, http://citeseer.ist.psu.edu/oai.html
Avrachenkov, K., Litvak, N., Son Pham, K.: A singular perturbation approach for choosing PageRank damping factor. ArXiv Mathematics e-prints (2006)
Bacchin, M., Ferro, N., Melucci, M.: The effectiveness of a graph-based algorithm for stemming. In: Lim, E.-p., Foo, S.S.-B., Khoo, C., Chen, H., Fox, E., Urs, S.R., Costantino, T. (eds.) ICADL 2002. LNCS, vol. 2555, pp. 117–128. Springer, Heidelberg (2002)
Baeza-Yates, R., Boldi, P., Castillo, C.: Generalizing PageRank: Damping functions for link-based ranking algorithms. In: Proc. ACM SIGIR 2006 (2006)
Berry, M.W.: Survey of Text Mining. Springer, Heidelberg (2003)
Boldi, P., Santini, M., Vigna, S.: PageRank as a function of the damping factor. In: Proc. ACM WWW 2005 (2005)
Boldi, P., Vigna, S.: The WebGraph framework I: Compression techniques. In: Proc. of the Thirteenth International World Wide Web Conference (WWW 2004), Manhattan, USA, pp. 595–601. ACM Press, New York (2004)
Cho, J., García-Molina, H., Page, L.: Efficient crawling through URL ordering. Computer Networks and ISDN Systems 30(1-7), 161–172 (1998)
Erkan, G., Radev, D.R.: Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research 22, 457–479 (2004)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proc. ACM SODA (2003)
Haveliwala, T.H.: Efficient computation of pagerank. Technical report (1999)
Jiang, X.M., Xue, G.R., Zeng, H.J., Chen, Z., Song, W.-G., Ma, W.-Y.: Exploiting pageRank at different block level. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds.) WISE 2004. LNCS, vol. 3306, pp. 241–252. Springer, Heidelberg (2004)
Kamvar, S.D., Haveliwala, T.H., Manning, C.D., Golub, G.H.: Extrapolation methods for accelerating pagerank computations. In: Proceedings of WWW, pp. 261–270. ACM, New York (2003)
Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The eigentrust algorithm for reputation management in p2p networks. In: Proc. of ACM WWW 2003(2003)
Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
Langville, A.N., Meyer, C.D.: Deeper inside PageRank. Internet Math. 1(3), 335–380 (2004)
Langville, A.N., Meyer, C.D.: Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, Princeton (2006)
Melucci, M., Pretto, L.: PageRank: When order changes. In: Amati, G., Carpineto, C., Romano, G. (eds.) ECiR 2007. LNCS, vol. 4425, pp. 581–588. Springer, Heidelberg (2007)
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Dig. Libr. Tech. Proj. (1998)
Peserico, E., Bressan, M.: Choose the Damping, Choose the Ranking? Technical report, Univ. Padova (2008), http://www.dei.unipd.it/~enoch/papers/damprank.pdf
Peserico, E., Pretto, L.: What does it mean to converge in rank? In: Proc. ICTIR 2007 (2007)
Pretto, L.: A theoretical analysis of google’s PageRank. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 131–144. Springer, Heidelberg (2002)
Chakrabarti, S., Dom, B.E., Gibson, D., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Experiments in topic distillation. In: Proc. ACM SIGIR Workshop on Hypertext IR on the Web (1998)
Tarau, P., Mihalcea, R., Figa, E.: Semantic document engineering with wordnet and PageRank. In: Proc. ACM SAC 2005 (2005)
University of Milan. Laboratory of Web Algorithmics, http://law.dsi.unimi.it/
Wangk, K.W.: Item selection by ’hub-authority’ profit ranking. In: Proc. ACM SIGKDD 2002 (2002)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bressan, M., Peserico, E. (2009). Choose the Damping, Choose the Ranking?. In: Avrachenkov, K., Donato, D., Litvak, N. (eds) Algorithms and Models for the Web-Graph. WAW 2009. Lecture Notes in Computer Science, vol 5427. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-95995-3_7
Download citation
DOI: https://doi.org/10.1007/978-3-540-95995-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-95994-6
Online ISBN: 978-3-540-95995-3
eBook Packages: Computer ScienceComputer Science (R0)