Abstract
This work presents a novel procedure for computing (1) distances between the nodes of a weighted, undirected graph, called the Euclidean Commute Time Distance (ECTD), and (2) a subspace projection of the nodes of the graph that preserves as much variance as possible, in terms of the ECTD – a principal components analysis of the graph. It is based on a Markov-chain model of a random walk through the graph. The model assigns transition probabilities to the links between nodes, so that a random walker can jump from node to node. A quantity called the average commute time measures the average time taken by a random walker to reach node j for the first time when starting from node i, and to come back to node i. The square root of this quantity, the ECTD, is a distance measure between any two nodes, and has the nice property of decreasing both when the number of paths connecting two nodes increases and when the “length” of any path decreases. The ECTD can be computed from the pseudoinverse of the Laplacian matrix of the graph, which is a kernel. We finally define the Principal Components Analysis (PCA) of a graph as the subspace projection that preserves as much variance as possible, in terms of the ECTD. This graph PCA has some interesting links with spectral graph theory, in particular spectral clustering.
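The construction described above can be sketched numerically. The snippet below is a minimal illustration, not code from the paper: it computes the ECTD for a hypothetical 3-node path graph from the pseudoinverse L+ of the Laplacian, using the closed form n(i,j) = V_G (l+_ii + l+_jj − 2 l+_ij) for the average commute time (V_G being the graph volume, i.e. the sum of degrees), and then builds the variance-preserving node embedding from the eigendecomposition of L+.

```python
import numpy as np

# Hedged sketch: Euclidean Commute Time Distance (ECTD) for a small
# weighted, undirected graph. The 3-node path graph 0-1-2 is an
# illustrative choice, not an example from the paper.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])   # symmetric adjacency (edge weights)

D = np.diag(A.sum(axis=1))        # degree matrix
L = D - A                         # graph Laplacian
L_plus = np.linalg.pinv(L)        # Moore-Penrose pseudoinverse of L
V_G = A.sum()                     # graph volume: sum of all degrees

# Average commute time n(i, j) = V_G * (l+_ii + l+_jj - 2 l+_ij);
# the ECTD is its square root.
d = np.diag(L_plus)
n = V_G * (d[:, None] + d[None, :] - 2.0 * L_plus)
ectd = np.sqrt(np.maximum(n, 0.0))

print(np.round(ectd**2, 6))
# Squared ECTD = average commute times; for this unweighted path graph
# they match V_G times the effective resistances (4 and 8 here).

# Graph PCA sketch: the eigendecomposition of L+ yields coordinates in
# which ordinary Euclidean distances equal the ECTD; keeping only the
# coordinates with the largest eigenvalues gives the projection that
# preserves the most variance in terms of the ECTD.
w_eig, U = np.linalg.eigh(L_plus)
X = np.sqrt(V_G) * U * np.sqrt(np.maximum(w_eig, 0.0))  # one row per node
```

Note that ||x_i − x_j|| computed from the embedding X reproduces ectd[i, j] exactly, which is what makes the ECTD a *Euclidean* distance and the PCA interpretation possible.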
Keywords
- Principal Component Analysis
- Random Walk
- Spectral Clustering
- Neural Information Processing System
- Laplacian Matrix
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Saerens, M., Fouss, F., Yen, L., Dupont, P. (2004). The Principal Components Analysis of a Graph, and Its Relationships to Spectral Clustering. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. ECML 2004. Lecture Notes in Computer Science(), vol 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_35
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8