[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3126908.3126921acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
research-article

Geometry-oblivious FMM for compressing dense SPD matrices

Published: 12 November 2017 Publication History

Abstract

We present GOFMM (geometry-oblivious FMM), a novel method that creates a hierarchical low-rank approximation, or "compression," of an arbitrary dense symmetric positive definite (SPD) matrix. For many applications, GOFMM enables an approximate matrix-vector multiplication in N log N or even N time, where N is the matrix size. Compression requires N log N storage and work. In general, our scheme belongs to the family of hierarchical matrix approximation methods. In particular, it generalizes the fast multipole method (FMM) to a purely algebraic setting by only requiring the ability to sample matrix entries. Neither geometric information (i.e., point coordinates) nor knowledge of how the matrix entries have been generated is required, thus the term "geometry-oblivious." Also, we introduce a shared-memory parallel scheme for hierarchical matrix computations that reduces synchronization barriers. We present results on the Intel Knights Landing and Haswell architectures, and on the NVIDIA Pascal architecture for a variety of matrices.

References

[1]
Emmanuel Agullo, Eric Darve, Luc Giraud, and Yuval Harness. 2016. Nearly optimal fast preconditioning of symmetric positive definite matrices. Ph.D. Dissertation. Inria Bordeaux Sud-Ouest.
[2]
Sivaram Ambikasaran. 2013. Fast Algorithms for Dense Numerical Linear Algebra and Applications. Ph.D. Dissertation. Stanford University.
[3]
Sivaram Ambikasaran and Eric Darve. 2013. An O(N log N) Fast Direct Solver for Partial Hierarchically Semi-Separable Matrices. Journal of Scientific Computing 57, 3 (2013), 477--501.
[4]
Mario Bebendorf. 2008. Hierarchical matrices. Springer.
[5]
Mario Bebendorf and Sergej Rjasanow. 2003. Adaptive low-rank approximation of collocation matrices. Computing 70, 1 (2003), 1--24.
[6]
Michele Benzi, Gene H Golub, and Jörg Liesen. 2005. Numerical solution of saddle point problems. Acta numerica 14, 1 (2005), 1--137.
[7]
Nicola Cancedda, Eric Gaussier, Cyril Goutte, and Jean Michel Renders. 2003. Word Sequence Kernels. Journal of Machine Learning Research 3 (March 2003), 1059--1082. htttp://dl.acm.org/citation.cfm?id=944919.944963
[8]
S. Chandrasekaran, P. Dewilde, M. Gu, and N. Somasunderam. 2010. On the Numerical Rank of the Off-Diagonal Blocks of Schur Complements of Discretized Elliptic PDEs. SIAM J. Matrix Anal. Appl. 31, 5 (2010), 2261--2290. //
[9]
Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2, 3 (2011), 27.
[10]
H. Cheng, Leslie Greengard, and Vladimir Rokhlin. 1999. A Fast Adaptive Multipole Algorithm in Three Dimensions. J. Comput. Phys. 155 (1999), 468--498.
[11]
D Yu Chenhan, William B March, and George Biros. 2017. An N log N Parallel Fast Direct Solver for Kernel Matrices. In Parallel and Distributed Processing Symposium (IPDPS), 2017 IEEE International. IEEE, 886--896.
[12]
P. Coulier, H. Pouransari, and E. Darve. 2016. The inverse fast multipole method: using a fast approximate direct solver as a preconditioner for dense linear systems. ArXiv e-prints (2016). arXiv:1508.01835
[13]
Ryan R. Curtin, James R. Cline, Neil P. Slagle, William B. March, P. Ram, Nishant A. Mehta, and Alexander G. Gray. 2013. MLPACK: A Scalable C++ Machine Learning Library. Journal of Machine Learning Research 14 (2013), 801--805.
[14]
S. Dasgupta and Y. Freund. 2008. Random projection trees and low dimensional manifolds. In Proceedings of the 40th annual ACM symposium on Theory of computing. ACM, 537--546.
[15]
William Fong and Eric Darve. 2009. The black-box fast multipole method. J. Comput. Phys. 228, 23 (2009), 8712--8725.
[16]
Pieter Ghysels, Xiaoye S. Li, FranÃğois-Henry Rouet, Samuel Williams, and Artem Napov. 2016. An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling. SIAM Journal on Scientific Computing 38, 5 (2016), S358--S384.
[17]
Lars Grasedyck, Ronald Kriemann, and Sabine Le Borne. 2008. Parallel black box-LU preconditioning for elliptic boundary value problems. Computing and visualization in science 11, 4 (2008), 273--291.
[18]
A.G. Gray and A.W. Moore. 2001. N-body problems in statistical learning. Advances in neural information processing systems (2001), 521--527.
[19]
Leslie Greengard, Denis Gueyffier, Per-Gunnar Martinsson, and Vladimir Rokhlin. 2009. Fast direct solvers for integral equations in complex three-dimensional domains. Acta Numerica 18, 1 (2009), 243--275.
[20]
Greengard, L. 1994. Fast Algorithms For Classical Physics. Science 265, 5174 (1994), 909--914.
[21]
Wolfgang Hackbusch. 2015. Hierarchical Matrices: Algorithms and Analysis (1 ed.). Springer-Verlag Berlin Heidelberg.
[22]
N. Halko, P.-G. Martinsson, and J.A. Tropp. 2011. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev. 53 (2011), 217--288.
[23]
Kenneth L Ho and Leslie Greengard. 2012. A fast direct solver for structured linear systems by recursive skeletonization. SIAM Journal on Scientific Computing 34, 5 (2012), A2507--A2532.
[24]
Thomas Hofmann, Bernhard Schölkopf, and Alexander J Smola. 2008. Kernel methods in machine learning. The annals of statistics (2008), 1171--1220.
[25]
George Karypis and Vipin Kumar. 1998. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed computing 48, 1 (1998), 96--129.
[26]
Risi Imre Kondor and John Lafferty. 2002. Diffusion kernels on graphs and other discrete input spaces. In ICML, Vol. 2. 315--322.
[27]
Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert. 2007. Randomized algorithms for the low-rank approximation of matrices. Proceedings of the National Academy of Sciences 104, 51 (2007), 20167--20172.
[28]
M. Lichman. 2013. UCI Machine Learning Repository. (2013). http://archive.ics.uci.edu/ml
[29]
M.W. Mahoney and P. Drineas. 2009. CUR matrix decompositions for improved data analysis. Proceedings of the National Academy of Sciences 106, 3 (2009), 697.
[30]
William B. March, Bo Xiao, Sameer Tharakan, Chenhan D. Yu, and George Biros. 2015. A kernel-independent FMM in general dimensions. In Proceedings of SC15 (The SCxy Conference series). ACM/IEEE, Austin, Texas.
[31]
William B. March, Bo Xiao, Sameer Tharakan, Chenhan D. Yu, and George Biros. 2015. Robust treecode approximation for kernel machines. In Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Sydney, Australia, 1--10.
[32]
William B. March, Bo Xiao, Chenhan Yu, and George Biros. 2015. An algebraic parallel treecode in arbitrary dimensions. In Proceedings of IPDPS 2015 (29th IEEE International Parallel and Distributed Computing Symposium). Hyderabad, India.
[33]
William B. March, Bo Xiao, Chenhan D. Yu, and George Biros. 2016. ASKIT: An Efficient, Parallel Library for High-Dimensional Kernel Summations. SIAM Journal on Scientific Computing 38, 5 (2016), S720--S749.
[34]
Per-Gunnar Martinsson. 2016. Compressing Rank-Structured Matrices via Randomized Sampling. SIAM Journal on Scientific Computing 38, 4 (2016), A1959--A1986.
[35]
Per-Gunnar Martinsson and Vladimir Rokhlin. 2007. An accelerated kernel-independent fast multipole method in one dimension. SIAM Journal on Scientific Computing 29, 3 (2007), 1160--1178.
[36]
P.-G. Martinsson, V. Rokhlin, and M. Tygert. 2010. A randomized algorithm for the decomposition of matrices. Applied and Computational Harmonic Analysis (2010).
[37]
K. R. Muske and J. W. Howse. 2001. A Lagrangian Method for Simultaneous Nonlinear Model Predictive Control. In Large-Scale PDE-constrained Optimization: State-of-the-Art, L.T. Biegler, O. Ghattas, M. Heinkenschloss, and B. van Bloemen Waanders (Eds.). Springer-Verlag.
[38]
François-Henry Rouet, Xiaoye S. Li, Pieter Ghysels, and Artem Napov. 2016. A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization. ACM Transactions in Mathematical Software 42, 4, Article 27 (June 2016), 35 pages.
[39]
Bernhard Schölkopf and Alexander J Smola. 2002. Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT press.
[40]
Avinash Sodani and others. 2016. Knights Landing: Second-Generation Intel Xeon Phi Product. IEEE Micro 36, 2 (2016), 34--46.
[41]
Haluk Topcuoglu, Salim Hariri, and Min-you Wu. 2002. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE transactions on parallel and distributed systems 13, 3 (2002), 260--274.
[42]
Christopher Williams and Matthias Seeger. 2001. Using the Nyström method to speed up kernel machines. In Proceedings of the 14th Annual Conference on Neural Information Processing Systems. 682--688.
[43]
Jianlin Xia, Shivkumar Chandrasekaran, Ming Gu, and Xiaoye S Li. 2010. Fast algorithms for hierarchically semiseparable matrices. Numerical Linear Algebra with Applications 17, 6 (2010), 953--976.
[44]
Bo Xiao and George Biros. 2016. Parallel Algorithms for Nearest Neighbor Search Problems in High Dimensions. SIAM Journal on Scientific Computing 38, 5 (2016), S667--S699.
[45]
Lexing Ying, George Biros, and Denis Zorin. 2004. A kernel-independent adaptive fast multipole method in two and three dimensions. J. Comput. Phys. 196, 2 (2004), 591--626.
[46]
Rio Yokota, Huda Ibeid, and David Keyes. 2016. Fast Multipole Method as a Matrix-Free Hierarchical Low-Rank Approximation. Computing Research Repository abs/1602.02244 (2016). http://arxiv.org/abs/1602.02244
[47]
Chenhan D Yu, Jianyu Huang, Woody Austin, Bo Xiao, and George Biros. 2015. Performance optimization for the k-nearest neighbors kernel on x86 architectures. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 7.
[48]
Chenhan D. Yu, William B. March, Bo Xiao, and George Biros. 2016. INV-ASKIT: A Parallel Fast Direct Solver for Kernel Matrices. In Proceedings of the IPDPS16. Chicago, USA.

Cited By

View all
  • (2024)Multiresolution kernel matrix algebraNumerische Mathematik10.1007/s00211-024-01409-8Online publication date: 9-May-2024
  • (2024)Efficient and Scalable Kernel Matrix Approximations Using Hierarchical DecompositionIntelligent Computers, Algorithms, and Applications10.1007/978-981-97-0065-3_1(3-16)Online publication date: 28-Jan-2024
  • (2023)Correlation-based sparse inverse Cholesky factorization for fast Gaussian-process inferenceStatistics and Computing10.1007/s11222-023-10231-533:3Online publication date: 28-Mar-2023
  • Show More Cited By

Index Terms

  1. Geometry-oblivious FMM for compressing dense SPD matrices

                    Recommendations

                    Comments

                    Please enable JavaScript to view thecomments powered by Disqus.

                    Information & Contributors

                    Information

                    Published In

                    cover image ACM Conferences
                    SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
                    November 2017
                    801 pages
                    ISBN:9781450351140
                    DOI:10.1145/3126908
                    • General Chair:
                    • Bernd Mohr,
                    • Program Chair:
                    • Padma Raghavan
                    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                    Sponsors

                    In-Cooperation

                    • IEEE CS

                    Publisher

                    Association for Computing Machinery

                    New York, NY, United States

                    Publication History

                    Published: 12 November 2017

                    Permissions

                    Request permissions for this article.

                    Check for updates

                    Author Tags

                    1. fast matrix multiplication
                    2. fast multipole methods
                    3. geometry-oblivious
                    4. heterogeneous computing
                    5. hierarchical matrices

                    Qualifiers

                    • Research-article

                    Conference

                    SC '17
                    Sponsor:

                    Acceptance Rates

                    SC '17 Paper Acceptance Rate 61 of 327 submissions, 19%;
                    Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

                    Upcoming Conference

                    Contributors

                    Other Metrics

                    Bibliometrics & Citations

                    Bibliometrics

                    Article Metrics

                    • Downloads (Last 12 months)20
                    • Downloads (Last 6 weeks)0
                    Reflects downloads up to 07 Jan 2025

                    Other Metrics

                    Citations

                    Cited By

                    View all
                    • (2024)Multiresolution kernel matrix algebraNumerische Mathematik10.1007/s00211-024-01409-8Online publication date: 9-May-2024
                    • (2024)Efficient and Scalable Kernel Matrix Approximations Using Hierarchical DecompositionIntelligent Computers, Algorithms, and Applications10.1007/978-981-97-0065-3_1(3-16)Online publication date: 28-Jan-2024
                    • (2023)Correlation-based sparse inverse Cholesky factorization for fast Gaussian-process inferenceStatistics and Computing10.1007/s11222-023-10231-533:3Online publication date: 28-Mar-2023
                    • (2022)Solving Linear Systems on a GPU with Hierarchically Off-Diagonal Low-Rank ApproximationsSC22: International Conference for High Performance Computing, Networking, Storage and Analysis10.1109/SC41404.2022.00089(1-15)Online publication date: Nov-2022
                    • (2022)High-Performance Spatial Data Compression for Scientific ApplicationsEuro-Par 2022: Parallel Processing10.1007/978-3-031-12597-3_25(403-418)Online publication date: 22-Aug-2022
                    • (2020)MatRoxProceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3332466.3374548(389-402)Online publication date: 19-Feb-2020
                    • (2020)Scalable and Memory-Efficient Kernel Ridge Regression2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS47924.2020.00102(956-965)Online publication date: May-2020
                    • (2019)Distributed-memory lattice H-matrix factorizationThe International Journal of High Performance Computing Applications10.1177/1094342019861139(109434201986113)Online publication date: Aug-2019
                    • (2019)Distributed O(N) Linear Solver for Dense Symmetric Hierarchical Semi-Separable Matrices2019 IEEE 13th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)10.1109/MCSoC.2019.00008(1-8)Online publication date: Oct-2019
                    • (2018)Distributed-memory hierarchical compression of dense SPD matricesProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.5555/3291656.3291676(1-15)Online publication date: 11-Nov-2018
                    • Show More Cited By

                    View Options

                    Login options

                    View options

                    PDF

                    View or Download as a PDF file.

                    PDF

                    eReader

                    View online with eReader.

                    eReader

                    Media

                    Figures

                    Other

                    Tables

                    Share

                    Share

                    Share this Publication link

                    Share on social media