Abstract
Ensemble Kalman inversion (EKI) is a popular algorithm for Bayesian inverse problems (Iglesias et al. in Inverse Probl 29:045001, 2013). It draws particles from a prior distribution and evolves them in pseudo-time. As the pseudo-time tends to infinity, the method finds the minimizer of the objective function; when the pseudo-time is stopped at 1, the ensemble distribution of the particles resembles, in some sense, the posterior distribution in the linear setting. The ideas trace back to the ensemble Kalman filter and its associated analysis (Evensen in J Geophys Res: Oceans 99:10143–10162, 1994; Reich in BIT Numer Math 51:235–249, 2011), but to date, when EKI is viewed as a sampling method, why it works, and in what sense and at what rate it converges, remain largely unknown. In this paper, we analyze the continuous version of EKI, a coupled SDE system, and prove its mean-field limit. In particular, we show that (1) as the number of particles goes to infinity, the empirical measure of the particles following the SDE converges, in Wasserstein-2 distance and with the optimal rate, to the solution of a Fokker–Planck equation, in both the linear and the weakly nonlinear case; and (2) in the linear case, the solution of the Fokker–Planck equation reconstructs the target distribution in finite time, as suggested in Iglesias et al. (Inverse Probl 29:045001, 2013).
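For orientation, here is a schematic form of the coupled SDE system that continuous-time stochastic EKI gives rise to (the notation \(y\), \({\mathcal {G}}\), \(\varGamma \) for data, forward map and noise covariance is generic, and the precise coefficients studied in the paper may differ in detail):
$$\begin{aligned} \mathrm {d}u^j_t=\mathrm {Cov}_{u,{\mathcal {G}}}(u_t)\,\varGamma ^{-1}\left( y-{\mathcal {G}}(u^j_t)\right) \mathrm {d}t+\mathrm {Cov}_{u,{\mathcal {G}}}(u_t)\,\varGamma ^{-\frac{1}{2}}\,\mathrm {d}W^j_t\,,\quad j=1,\cdots ,J\,, \end{aligned}$$
where \(\mathrm {Cov}_{u,{\mathcal {G}}}(u_t)=\frac{1}{J}\sum ^J_{k=1}\left( u^k_t-{\bar{u}}_t\right) \otimes \left( {\mathcal {G}}(u^k_t)-\overline{{\mathcal {G}}(u_t)}\right) \) is the empirical cross-covariance of the ensemble and \(\{W^j_t\}\) are independent Brownian motions. In the mean-field limit the empirical covariance is replaced by its counterpart under the limiting density, which leads to the Fokker–Planck equation mentioned above.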
References
Bergemann, K., Reich, S.: A localization technique for ensemble Kalman filters. Q J R Meteorol Soc 136(648), 701–707 (2010a)
Bergemann, K., Reich, S.: A mollified ensemble Kalman filter. Q J R Meteorol Soc 136(651), 1636–1643 (2010b)
Blömker, D., Schillings, C., Wacker, P., Weissmann, S.: Well posedness and convergence analysis of the ensemble Kalman inversion. Inverse Probl. 35(8), 085007 (2019)
Blömker, D., Schillings, C., Wacker, P.: A strongly convergent numerical scheme from ensemble Kalman inversion. SIAM J. Numer. Anal. 56(4), 2537–2562 (2018)
Bolley, F., Cañizo, J.A., Carrillo, J.A.: Stochastic mean-field limit: non-Lipschitz forces and swarming. Math. Models Methods Appl. Sci. 21(11), 2179–2210 (2011)
Cañizo, J.A., Carrillo, J.A., Rosado, J.: A well-posedness theory in measures for some kinetic models of collective motion. Math. Models Methods Appl. Sci. 21(03), 515–539 (2011)
Craig, K., Bertozzi, A.: A blob method for the aggregation equation. Math. Comput. 85, 05 (2014)
Dashti, M., Stuart, A.M.: The Bayesian Approach to Inverse Problems. Springer, Cham (2017)
Del Moral, P.: Feynman-Kac Formulae: Genealogical and Interacting Particle Approximations. Springer, Berlin (2004)
Ding, Z., Li, Q.: Ensemble Kalman sampling: mean-field limit and convergence analysis. arXiv:1910.12923 (2019)
Ding, Z., Li, Q., Lu, J.: Ensemble Kalman inversion for nonlinear problems: weights, consistency, and variance bounds. Foundations of Data Science (2020)
Doucet, A., de Freitas, N., Gordon, N.: An Introduction to Sequential Monte Carlo Methods. Springer, New York (2001)
Ernst, O.G., Sprungk, B., Starkloff, H.-J.: Analysis of the ensemble and polynomial chaos Kalman filters in Bayesian inverse problems. SIAM/ASA J Uncertain Quantif 3(1), 823–851 (2015). https://doi.org/10.1137/140981319
Evensen, G.: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res.: Oceans 99(C5), 10143–10162 (1994)
Evensen, G.: The ensemble Kalman filter: theoretical formulation and practical implementation. Ocean Dyn. 53(4), 343–367 (2003)
Evensen, G.: Data Assimilation-The Ensemble Kalman Filter. Springer, New York (2006)
Fournier, N., Guillin, A.: On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 162(3), 707–738 (2015)
Garbuno-Inigo, A., Hoffmann, F., Li, W., Stuart, A.M.: Interacting Langevin diffusions: gradient structure and ensemble Kalman sampler. SIAM J. Appl. Dyn. Syst. 19(1), 412–441 (2020)
Ghil, M., Cohn, S., Tavantzis, J., Bube, K., Isaacson, E.: Applications of Estimation Theory to Numerical Weather Prediction. Springer, New York (1981)
Herty, M., Visconti, G.: Kinetic methods for inverse problems. Kinet Relat Models 12, 1109 (2019)
Houtekamer, P.L., Mitchell, H.L.: A sequential ensemble Kalman filter for atmospheric data assimilation. Mon. Weather Rev. 129(1), 123–137 (2001)
Iglesias, M.A., Law, K., Stuart, A.M.: Ensemble Kalman methods for inverse problems. Inverse Probl. 29(4), 045001 (2013)
Lange, T., Stannat, W.: On the continuous time limit of the ensemble Kalman filter. Math. Comput. 90(327), 233–265 (2021)
Law, K.J.H., Tembine, H., Tempone, R.: Deterministic mean-field ensemble Kalman filtering. SIAM J. Sci. Comput. 38(3), A1251–A1279 (2016). https://doi.org/10.1137/140984415
Le Gland, F., Monbet, V., Tran, V.-D.: Large sample asymptotics for the ensemble Kalman filter. In: The Oxford Handbook of Nonlinear Filtering. Oxford University Press (2011)
Liu, Q., Wang, D.: Stein variational gradient descent: A general purpose Bayesian inference algorithm. Adv. Neural Inf. Process. Syst. 29, 2378–2386 (2016)
Lu, J., Lu, Y., Nolen, J.: Scaling limit of the stein variational gradient descent: the mean field regime. SIAM J. Math. Anal. 51(2), 648–671 (2019a)
Lu, Y., Lu, J., Nolen, J.: Accelerating Langevin sampling with birth-death. arXiv:1905.09863 (2019)
Pavon, M., Tabak, E.G., Trigila, G.: The data-driven Schrödinger bridge. arXiv:1806.01364 (2018)
Reich, S.: A dynamical systems framework for intermittent data assimilation. BIT Numer. Math. 51(1), 235–249 (2011)
Reich, S.: Data assimilation: the Schrödinger perspective. Acta Numerica 28, 635–711 (2019)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer, New York (2004)
Schillings, C., Stuart, A.M.: Analysis of the ensemble Kalman filter for inverse problems. SIAM J. Numer. Anal. 55(3), 1264–1290 (2017)
Schillings, C., Stuart, A.M.: Convergence analysis of ensemble Kalman inversion: the linear, noisy case. Appl. Anal. 97(1), 107–123 (2018)
Stuart, A.M.: Inverse problems: A Bayesian perspective. Acta Numerica 19, 451–559 (2010)
Sznitman, A.: Topics in propagation of chaos. In: École d'Été de Probabilités de Saint-Flour XIX — 1989, pp. 165–251. Springer, Berlin, Heidelberg (1991)
Acknowledgements
The research of Q.L. and Z.D. was supported in part by the National Science Foundation under awards 1619778 and 1750488 and by the Wisconsin Data Science Initiative. Both authors would like to thank Andrew Stuart for helpful discussions.
Appendices
Appendix A: Moment bound for sums of independent mean-zero random variables
In this section, we prove a lemma used in the proof of Lemma 3.
Lemma 9
Assume \(x_1,\cdots ,x_J\) are i.i.d. random variables and satisfy (for \(p\ge 2\))
$$\begin{aligned} {\mathbb {E}}x_j=0\,,\quad {\mathbb {E}}|x_j|^p\le {\mathcal {L}}_p\,,\quad 1\le j\le J\,. \end{aligned}$$
Then, we have
$$\begin{aligned} {\mathbb {E}}\left| \sum ^J_{j=1}x_j\right| ^p\le CJ^{p/2}\,, \end{aligned}$$
where C is a constant that depends only on \({\mathcal {L}}_p\) and p.
Proof
Without loss of generality, we assume p is an even number and \(J>p/2\). Then \({\mathbb {E}}\left| \sum ^J_{j=1}x_j\right| ^p={\mathbb {E}}\left( \sum ^J_{j=1}x_j\right) ^{p}\).
Since \(\{x_i\}\) are independent with zero mean, we have
$$\begin{aligned} {\mathbb {E}}\left( \sum ^J_{j=1}x_j\right) ^{p}=\sum _{j_1+\cdots +j_J=p}\frac{p!}{j_1!\cdots j_J!}\,{\mathbb {E}}\left( x_1^{j_1}\cdots x_J^{j_J}\right) =\sum _{j_1+\cdots +j_J=p}\frac{p!}{j_1!\cdots j_J!}\prod ^J_{n=1}{\mathbb {E}}x_n^{j_n}\,, \end{aligned}$$
where the \(j_n\) are nonnegative integers, and any multi-index with some \(j_n=1\) contributes nothing (since \({\mathbb {E}}x_i=0\)), so only multi-indices with every \(j_n\ne 1\) survive.
For each term in the summation, using a generalization of Hölder's inequality, we have
$$\begin{aligned} \left| {\mathbb {E}}\left( x_1^{j_1}\cdots x_J^{j_J}\right) \right| \le \prod ^J_{n=1}\left( {\mathbb {E}}|x_n|^{p}\right) ^{j_n/p}\le {\mathcal {L}}_p\,, \end{aligned}$$
which implies
$$\begin{aligned} {\mathbb {E}}\left| \sum ^J_{j=1}x_j\right| ^p\le p!\,{\mathcal {L}}_p\,|I_1|\,, \end{aligned}$$
where
$$\begin{aligned} I_1=\left\{ (j_1,\cdots ,j_J):\ j_n\in \{0,2,3,\cdots ,p\}\,,\ \sum ^J_{n=1}j_n=p\right\} \end{aligned}$$
and \(|I_1|\) denotes the cardinality of the set \(I_1\).
In \(I_1\), if \(j_n\) is nonzero, then it is at least 2, meaning there are at most p/2 nonzero entries in the multi-index. Therefore, we have the following inequality
$$\begin{aligned} |I_1|\le P(J,p/2)\,|I_{2}|\,. \end{aligned}$$
Here P(J, p/2) denotes the number of p/2-permutations of J elements and is thus smaller than \(J^{p/2}\), and \(I_{2}\) is a new set defined by:
$$\begin{aligned} I_{2}=\left\{ (j_1,\cdots ,j_{p/2}):\ j_n\in \{0,2,3,\cdots ,p\}\,,\ \sum ^{p/2}_{n=1}j_n=p\right\} \,. \end{aligned}$$
Its cardinality does not depend on J, and thus we bound it by C(p), a constant depending on p only. Combining the estimates above yields \({\mathbb {E}}\left| \sum ^J_{j=1}x_j\right| ^p\le CJ^{p/2}\), finishing the proof. \(\square \)
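The \(J^{p/2}\) scaling in Lemma 9 is easy to check numerically. Below is a minimal sanity check (a sketch, assuming standard Gaussian \(x_j\) and \(p=4\), which are illustrative choices not taken from the paper): the Monte Carlo estimate of \({\mathbb {E}}\left| \sum _j x_j\right| ^p/J^{p/2}\) should stay bounded as J grows (for Gaussians and \(p=4\) it is close to 3).

```python
import numpy as np

# Sanity check of Lemma 9 (not part of the proof): for i.i.d. mean-zero x_j
# with bounded p-th moment, E|sum_j x_j|^p grows like C * J^(p/2).
# Illustrative assumption: x_j standard Gaussian, p = 4, so sum_j x_j ~ N(0, J)
# and E|sum_j x_j|^4 = 3 J^2, i.e. the printed ratio should hover around 3.
rng = np.random.default_rng(0)
p = 4
for J in (10, 100, 1000):
    sums = rng.standard_normal((10_000, J)).sum(axis=1)   # 10,000 replicas of sum_j x_j
    moment = np.mean(np.abs(sums) ** p)                   # Monte Carlo estimate of E|sum_j x_j|^p
    print(f"J = {J:5d}   E|sum|^p / J^(p/2) = {moment / J ** (p / 2):.3f}")
```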
Appendix B: Bounds on high moments of \(\{u^j\}\)
Proof
For convenience, we omit the subscript t in \(u,\mathbf{u },e,\mathbf{e }\), etc. First, we prove the boundedness of \({\mathbb {E}}\left[ \frac{1}{J}\sum ^J_{j=1}|e^j|^2\right] ^p\), which we will use later.
which also implies
Then, we first estimate \({\mathbb {E}}|\mathbf{u }^j|^{2p}\). Using Itô's formula, for fixed \(1\le j\le J\) and \(p\ge 1\), we obtain
where \(\mathrm {\mathbf{R }}\) is the coefficient in front of the Brownian motion. The first term is negative. To complete the computation, we need to bound the remaining terms. The second term is bounded by:
The third term is bounded by:
Similarly, the remaining terms are bounded by:
and
and
Plugging all these inequalities back into (65) and using (64), we have:
Next, to deal with \({\mathbb {E}}|u^j|^{2p}\), we again use Itô's formula; for fixed \(1\le j\le J\) and \(p\ge 1\), we obtain
where \(\mathrm {R}\) is the coefficient in front of the Brownian motion. The six terms are treated separately:
-
Term 1
$$\begin{aligned} \begin{aligned}&\left| {\mathbb {E}}\left( |u^j|^{2(p-1)}\left\langle u^j,\mathrm {Cov}_{u,\mathbf{u }}\mathbf{u }^j\right\rangle \right) \right| \\&\quad \le {\mathbb {E}}\left( |u^j|^{2p-\frac{1}{2}}\frac{1}{J}\sum ^J_{k=1}|e^k||\mathbf{e }^k||\mathbf{u }^j|\right) \\&\quad \le \left( {\mathbb {E}}|u^j|^{2p}\right) ^{(2p-\frac{1}{2})/(2p)}\left( {\mathbb {E}}\left( \frac{1}{J}\sum ^J_{k=1}|e^k||\mathbf{e }^k||\mathbf{u }^j|\right) ^{4p}\right) ^{1/(4p)}\\&\quad \le C\left( {\mathbb {E}}|u^j|^{2p}\right) ^{(2p-\frac{1}{2})/(2p)}\,, \end{aligned} \end{aligned}$$
where in the last inequality we use (63), (64) and (66) with Hölder’s inequality.
-
Term 2
$$\begin{aligned}&\left| {\mathbb {E}}\left( |u^j|^{2(p-1)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}\left\langle e^i,e^k\right\rangle \left\langle \mathbf{e }^k,\mathbf{e }^i\right\rangle \right] \right) \right| \\&\quad \le C{\mathbb {E}}\left( |u^j|^{2(p-1)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}|e^i| |e^k||\mathbf{e }^i||\mathbf{e }^k|\right] \right) \\&\quad \le C{\mathbb {E}}\left( |u^j|^{2(p-1)}\left( \frac{1}{J}\sum ^J_{i=1}|e^i|^2\right) \left( \frac{1}{J}\sum ^J_{k=1}|e^k|^2\right) \right) \\&\quad \le C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\left[ \frac{1}{J}\sum ^J_{i=1}|e^i|^2\right] ^{2p}\right) ^{1/p}\\&\quad \le C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\left[ \sum ^K_{m=1}\frac{1}{J}\sum ^J_{i=1}|e^i_m|^2\right] ^{2p}\right) ^{1/p}\\&\quad \le C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\sum ^K_{m=1}\left[ \frac{1}{J}\sum ^J_{i=1}|e^i_m|^2\right] ^{2p}\right) ^{1/p}\\&\quad \le CV_{4p}^{1/p}(e_0){\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p}\,. \end{aligned}$$
-
Term 3
$$\begin{aligned} \begin{aligned}&\left| {\mathbb {E}}\left( \left| u^j\right| ^{2(p-2)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}\left\langle u^j,e^i\right\rangle \left\langle u^j,e^k\right\rangle \left\langle \mathbf{e }^i,\mathbf{e }^k\right\rangle \right] \right) \right| \\&\le \,C{\mathbb {E}}\left( |u^j|^{2(p-1)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}|e^i| |e^k||\mathbf{e }^i||\mathbf{e }^k|\right] \right) \\&\le CV_{4p}^{1/p}(e_0){\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p}\,. \end{aligned} \end{aligned}$$
-
Term 4
$$\begin{aligned} \begin{aligned}&\left| {\mathbb {E}}\left( |u^j|^{2(p-1)}\left\langle u^j,\mathrm {Cov}_{u,\mathbf{r }}\varGamma ^{-\frac{1}{2}}\left( r-\mathrm {m}(u)\right) \right\rangle \right) \right| \\ \le&\quad M^2{\mathbb {E}}\left( |u^j|^{2p-\frac{1}{2}}\frac{1}{J}\sum ^J_{k=1}|e^k|\right) \\ \le&\quad \left( {\mathbb {E}}|u^j|^{2p}\right) ^{(2p-\frac{1}{2})/(2p)}\left( {\mathbb {E}}\left( \frac{1}{J}\sum ^J_{k=1}|e^k|\right) ^{4p}\right) ^{1/(4p)}\\ \le&C\left( {\mathbb {E}}|u^j|^{2p}\right) ^{(2p-\frac{1}{2})/(2p)}\,, \end{aligned} \end{aligned}$$
where in the last inequality we use (63) and (66) with Hölder’s inequality.
-
Term 5
$$\begin{aligned} \begin{aligned}&\left| {\mathbb {E}}\left( |u^j|^{2(p-1)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}\left\langle e^i,e^k\right\rangle \left\langle \mathbf{r }^k,\mathbf{r }^i\right\rangle \right] \right) \right| \\ \le&\ CM^2{\mathbb {E}}\left( |u^j|^{2(p-1)}\left( \frac{1}{J}\sum ^J_{i=1}|e^i|^2\right) \right) \\ \le&\ C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\left[ \frac{1}{J}\sum ^J_{i=1}|e^i|^2\right] ^{p}\right) ^{1/p}\\ \le&\ C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\left[ \sum ^K_{m=1}\frac{1}{J}\sum ^J_{i=1}|e^i_m|^2\right] ^{p}\right) ^{1/p}\\ \le&\ C{\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p} \left( {\mathbb {E}}\sum ^K_{m=1}\left[ \frac{1}{J}\sum ^J_{i=1}|e^i_m|^2\right] ^{p}\right) ^{1/p}\\ \le&\ CV_{2p}^{1/p}(e_0){\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p}\,. \end{aligned} \end{aligned}$$
-
Term 6
$$\begin{aligned} \begin{aligned}&\left| {\mathbb {E}}\left( \left| u^j\right| ^{2(p-2)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}\left\langle u^j,e^i\right\rangle \left\langle u^j,e^k\right\rangle \left\langle \mathbf{r }^i,\mathbf{r }^k\right\rangle \right] \right) \right| \\ \le&\,C{\mathbb {E}}\left( |u^j|^{2(p-1)}\left[ \frac{1}{J^2}\sum ^J_{i,k=1}|e^i| |e^k||\mathbf{r }^i||\mathbf{r }^k|\right] \right) \\ \le&\,CV_{2p}^{1/p}(e_0){\mathbb {E}}\left( |u^j|^{2p}\right) ^{(p-1)/p}\,. \end{aligned} \end{aligned}$$
By Lemma 4, we obtain the boundedness of \({\mathbb {E}}\left\| u^j\right\| ^{2p}_2\). Then, to prove the second inequality of (45), it suffices to show
which follows directly from the expansion of \(\mathrm {Cov}_{u_t}\) and the triangle inequality:
Here the last inequality holds because each term in the sum is bounded by
\(\square \)
Cite this article
Ding, Z., Li, Q. Ensemble Kalman inversion: mean-field limit and convergence analysis. Stat Comput 31, 9 (2021). https://doi.org/10.1007/s11222-020-09976-0