Annex: Consistency of CL-SGD with the Sketch Matching Problem
In this section, we give the lemmas needed to link our stochastic descent directions with the original sketch matching problem (summarized by the Lemmas of Sect. 3.2). The central idea of our method is that \(S\mu _\theta \) can generally be approximated by \(B_{\textbf{p}}\mu _\theta ({\textbf{p}})\), and this approximation carries over to the chosen stochastic gradients.
Lemma 6.1
Consider \({\mathcal {S}}\) constructed with frequencies \((\omega _l)_{l=1}^m\). Let \(B_{\textbf{p}} \in {\mathbb {C}}^{m\times P}\) with general term \(B_{{\textbf{p}},l,i} = \frac{e^{-j \langle \omega _l,p_i \rangle }}{P}\) and \(\mu \in {\mathcal {M}}({\mathcal {D}})\). Then,
$$\begin{aligned} {\mathbb {E}}_{\textbf{p}} \left( B_{\textbf{p}}\mu ({\textbf{p}})\right) =S\mu . \end{aligned}$$
(34)
Proof
The expectation yields for \(l \in \{1,\ldots ,m \}\)
$$\begin{aligned} \begin{aligned} {[}{\mathbb {E}}_{\textbf{p}}(B_{\textbf{p}}\mu ({\textbf{p}}))]_l&= {\mathbb {E}}_{\textbf{p}} \left( \sum _{r=1}^P \frac{e^{-j \langle \omega _l,p_r\rangle }\mu (p_r) }{P}\right) . \end{aligned} \end{aligned}$$
(35)
As the \(p_r\) are i.i.d. and uniformly distributed on \({{\mathcal {D}} = [0,1]^d}\), we have \({\int _{p_1 \in {\mathcal {D}}}d p_1= 1}\) and
$$\begin{aligned} \begin{aligned} {[}{\mathbb {E}}_{\textbf{p}}(B_{\textbf{p}}\mu ({\textbf{p}}))]_l&= P {\mathbb {E}}_{\textbf{p}} \left( \frac{e^{-j \langle \omega _l,p_1\rangle } \mu (p_1) }{P} \right) \\&= \frac{\int _{p_1 \in {\mathcal {D}}} e^{-j \langle \omega _l,p_1\rangle } \mu (p_1) d p_1}{{\int _{p_1 \in {\mathcal {D}}}d p_1}}= [S\mu ]_l. \end{aligned} \end{aligned}$$
(36)
\(\square \)
This shows that on average, random discretization of the data domain for the forward sketching operator is consistent with the original sketch.
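As a numerical sanity check of Lemma 6.1 (not part of the original analysis), the following Python sketch compares a Monte Carlo average of \(B_{\textbf{p}}\mu ({\textbf{p}})\) with a quadrature approximation of \(S\mu \). It assumes \(d = 1\), \({\mathcal {D}} = [0,1]\) and uniform sampling of the \(p_i\); the density `mu`, the frequencies `omegas`, and all function names are hypothetical choices made for illustration only.

```python
import cmath
import random

def mu(x):
    # Hypothetical test density on D = [0, 1] (assumption, not from the paper)
    return 2.0 * x

def sketch(mu, omegas, n_quad=20000):
    # [S mu]_l = int_0^1 exp(-j * omega_l * x) mu(x) dx, midpoint rule
    h = 1.0 / n_quad
    xs = [(i + 0.5) * h for i in range(n_quad)]
    return [sum(cmath.exp(-1j * w * x) * mu(x) for x in xs) * h for w in omegas]

def discretized_sketch(mu, omegas, points):
    # [B_p mu(p)]_l = (1/P) sum_i exp(-j * omega_l * p_i) mu(p_i)
    P = len(points)
    return [sum(cmath.exp(-1j * w * p) * mu(p) for p in points) / P for w in omegas]

def monte_carlo_mean(mu, omegas, P, n_trials, seed=0):
    # Average B_p mu(p) over independent draws of p (uniform on [0, 1])
    rng = random.Random(seed)
    acc = [0j] * len(omegas)
    for _ in range(n_trials):
        pts = [rng.random() for _ in range(P)]
        est = discretized_sketch(mu, omegas, pts)
        acc = [a + e for a, e in zip(acc, est)]
    return [a / n_trials for a in acc]

omegas = [1.0, 3.0, 7.5]  # arbitrary illustrative frequencies
exact = sketch(mu, omegas)
approx = monte_carlo_mean(mu, omegas, P=8, n_trials=20000)
max_err = max(abs(a - e) for a, e in zip(approx, exact))
print(max_err)
```

The printed deviation should shrink as `n_trials` grows, whatever the value of `P`, which is what unbiasedness predicts.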
To calculate the expectation of our stochastic gradients, we provide the following Lemma which gives the expectation of the discretized cross-product between two measures. We write \(\langle \mu _1, \mu _2 \rangle _{L^2({\mathcal {D}})}:= \int _{{\mathcal {D}}}\mu _1(x)\mu _2(x)dx \) the cross-product between two densities \(\mu _1\) and \(\mu _2\).
Lemma 6.2
Consider \({\mathcal {S}}\) constructed with frequencies \((\omega _l)_{l=1}^m\). Let \(B_{\textbf{p}} \in {\mathbb {C}}^{m\times P}\) with general term \(B_{{\textbf{p}},l,i} = \frac{e^{-j \langle \omega _l,p_i \rangle }}{P}\) and \(\mu _1,\mu _2 \in {\mathcal {M}}({\mathcal {D}})\). We have
$$\begin{aligned} \begin{aligned} {\mathbb {E}}_{\textbf{p}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{p}}\mu _2({\textbf{p}}) \rangle&= \frac{m}{P} \langle \mu _1, \mu _2 \rangle _{L^2({\mathcal {D}})} \\&\quad +\frac{P-1}{P}\langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2 \rangle .\\ \end{aligned} \end{aligned}$$
(37)
Proof
We have
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{\textbf{p}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{p}}\mu _2({\textbf{p}}) \rangle \\&\quad = {\mathbb {E}}_{\textbf{p}} (\mu _2({\textbf{p}})^T B_{\textbf{p}} ^* B_{\textbf{p}}\mu _1({\textbf{p}}) )\\&\quad =\frac{1}{P^2} {\mathbb {E}}_{\textbf{p}} \Bigl ( \sum _{t=1}^P \mu _2 (p_t)\sum _{g=1}^m e^{j\langle \omega _g,p_t\rangle } \sum _{r=1}^P e^{-j \langle \omega _g,p_r\rangle } \mu _1(p_r)\Bigr )\\&\quad =\frac{1}{P^2} \sum _{t=1}^P \sum _{g=1}^m\sum _{r=1}^P {\mathbb {E}}_{\textbf{p}} \left( e^{j\langle \omega _g,p_t-p_r\rangle }\mu _2 (p_t)\mu _1(p_r)\right) . \end{aligned}\nonumber \\ \end{aligned}$$
(38)
The diagonal terms in the double sum over t and r, i.e., those where \(t = r\), are
$$\begin{aligned} \begin{aligned} D&= \frac{1}{P^2} \sum _{t=1}^P \sum _{g=1}^m {\mathbb {E}}_{\textbf{p}}\left( \mu _2 (p_t)\mu _1(p_t)\right) = \frac{m}{P} \langle \mu _1, \mu _2 \rangle _{L^2({\mathcal {D}})}. \end{aligned}\nonumber \\ \end{aligned}$$
(39)
The non-diagonal terms \(t \ne r\) give (using the fact that the \(p_i\) are i.i.d.):
$$\begin{aligned} \begin{aligned} N&=\frac{1}{P^2} \sum _{t=1}^P \sum _{g=1}^m\sum _{r=1, r\ne t}^P {\mathbb {E}}_{\textbf{p}}\left( e^{j\langle \omega _g,p_t-p_r\rangle }\mu _2 (p_t)\mu _1(p_r)\right) \\&=\frac{P-1}{P} \sum _{g=1}^m \left( {\mathbb {E}}_{\textbf{p}} e^{j\langle \omega _g,p_1\rangle }\mu _2 (p_1)\right) \left( {\mathbb {E}}_{\textbf{p}}e^{-j \langle \omega _g,p_1\rangle } \mu _1(p_1)\right) \\&=\frac{P-1}{P} \sum _{g=1}^m ({\mathcal {S}}\mu _2)_g^*({\mathcal {S}}\mu _1)_g \\&= \frac{P-1}{P}\langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2 \rangle . \end{aligned}\nonumber \\ \end{aligned}$$
(40)
\(\square \)
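The identity (37) can also be checked numerically. The sketch below is an illustration under assumptions that are not in the original text: \(d = 1\), \({\mathcal {D}} = [0,1]\), uniform sampling, two hypothetical densities `mu1` and `mu2`, and arbitrary frequencies `OMEGAS`. The inner product follows the convention of Eq. (40), \(\langle a, b \rangle = \sum _g b_g^* a_g\).

```python
import cmath
import random

OMEGAS = [1.0, 3.0, 7.5]  # arbitrary illustrative frequencies (assumption)

def mu1(x):
    return 2.0 * x        # hypothetical density on [0, 1]

def mu2(x):
    return 3.0 * x * x    # hypothetical density on [0, 1]

def midpoint_integral(f, n=20000):
    # Midpoint-rule approximation of int_0^1 f(x) dx
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

def sketch(mu):
    # [S mu]_g = int_0^1 exp(-j * omega_g * x) mu(x) dx
    return [midpoint_integral(lambda x, w=w: cmath.exp(-1j * w * x) * mu(x))
            for w in OMEGAS]

def discretized_sketch(mu, points):
    # [B_p mu(p)]_g = (1/P) sum_i exp(-j * omega_g * p_i) mu(p_i)
    P = len(points)
    return [sum(cmath.exp(-1j * w * p) * mu(p) for p in points) / P for w in OMEGAS]

def inner(a, b):
    # <a, b> = sum_g conj(b_g) * a_g
    return sum(bg.conjugate() * ag for ag, bg in zip(a, b))

def predicted(P):
    # Right-hand side of (37)
    m = len(OMEGAS)
    l2 = midpoint_integral(lambda x: mu1(x) * mu2(x))
    return m / P * l2 + (P - 1) / P * inner(sketch(mu1), sketch(mu2))

def empirical(P, n_trials=20000, seed=0):
    # Monte Carlo estimate of the left-hand side of (37)
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_trials):
        pts = [rng.random() for _ in range(P)]  # same points p for both measures
        total += inner(discretized_sketch(mu1, pts), discretized_sketch(mu2, pts))
    return total / n_trials

gap = abs(empirical(P=4) - predicted(P=4))
print(gap)
```

Because the same points are used for both measures, the diagonal contribution \(\frac{m}{P} \langle \mu _1, \mu _2 \rangle _{L^2({\mathcal {D}})}\) is visible for small `P` and fades as `P` increases.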
We also calculate the variance of the unbiased estimator of the gradient of G thanks to the following Lemma.
Lemma 6.3
Consider \({\mathcal {S}}\) constructed with frequencies \((\omega _l)_{l=1}^m\). Let \(B_{\textbf{p}} \in {\mathbb {C}}^{m\times P}\) with general term \(B_{{\textbf{p}},l,i} = \frac{e^{-j \langle \omega _l,p_i \rangle }}{P}\) and \(\mu _1,\mu _2 \in {\mathcal {M}}({\mathcal {D}})\). We have
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2 \\&\qquad - \vert {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2 \\&\quad = \frac{1}{P^2} \langle \vert \mu _1\vert ^2,\vert \mu _2 \vert ^2\rangle _{L^2,\vert {\mathcal {S}}^* {\textbf{1}} \vert ^2} + \frac{1}{P} C(\mu _1,\mu _2,z) \end{aligned} \end{aligned}$$
(41)
where we define \({\mathcal {S}}^*z: p \rightarrow \sum _g z_ge^{j \langle \omega _g,p \rangle }\) and, for a kernel h (a function from \({\mathcal {D}}\) to \({\mathbb {R}}\)) and two measures \(\nu _1, \nu _2\), \(\langle \nu _1, \nu _2 \rangle _{L^2({\mathcal {D}}),h}:= \int _{x,y} \nu _1(x)\nu _2(y)h(x-y) dxdy\), and where
$$\begin{aligned} \begin{aligned}&C(\mu _1,\mu _2,z)\\&\quad := \frac{P-1}{P} ( \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _1 \vert ^2, \vert \mu _2 \vert ^2\rangle _{L^2} + \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _2 \vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2}) \\&\qquad + {\mathcal {R}}e\langle \vert {\mathcal {S}}^*z\vert ^2 -2{\mathcal {S}}^*z({\mathcal {S}}^*{\mathcal {S}}\mu _2)^*, \vert \mu _1 \vert ^2\rangle _{L^2}\\&\qquad + 2{\mathcal {R}}e\left( \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2 \rangle ^*\right) - \vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2 \\&\qquad - \frac{2P-1}{P} \vert \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2\rangle \vert ^2. \end{aligned} \end{aligned}$$
(42)
Proof
We need to calculate a few terms separately. For \(y\in {\mathbb {C}}^m\), using the fact that
$$\begin{aligned} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z\rangle =\frac{1}{P}\sum _{g=1}^m\sum _{t=1}^P e^{j\langle \omega _g,p_t\rangle }\mu _1 (p_t)z_g, \end{aligned}$$
we have
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z\rangle \langle B_{\textbf{p}}\mu _1({\textbf{p}}), y\rangle ^* \\&\quad =\frac{1}{P^2} \sum _{g,t} \sum _{{\tilde{g}},{\tilde{t}}} {\mathbb {E}}_{{\textbf{p}}} \Big (e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle } \mu _1 (p_t) \mu _1 (p_{{\tilde{t}}})z_g y_{{\tilde{g}}}^*\Big ) \\&\quad =\frac{1}{P^2} \sum _{g,{\tilde{g}}} z_g y_{{\tilde{g}}}^* \sum _{t,{\tilde{t}}} {\mathbb {E}}_{{\textbf{p}}}\Big ( e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle }\mu _1 (p_t) \mu _1 (p_{{\tilde{t}}})\Big ). \\ \end{aligned}\nonumber \\ \end{aligned}$$
(43)
As the \(p_t\) are i.i.d., we have
$$\begin{aligned} \begin{aligned}&\sum _{t,{\tilde{t}}} {\mathbb {E}}_{{\textbf{p}}} e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle }\mu _1 (p_t) \mu _1 (p_{{\tilde{t}}}) \quad \\&\quad = P {\mathbb {E}}_{{\textbf{p}}} [e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_1\rangle } \vert \mu _1 (p_1)\vert ^2]\\&\qquad + P(P-1) ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}}. \end{aligned} \end{aligned}$$
(44)
We obtain
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z\rangle \langle B_{\textbf{p}}\mu _1({\textbf{p}}), y\rangle ^*\\&\quad = \frac{1}{P^2} \sum _{g,{\tilde{g}}} \Big ( z_g y_{{\tilde{g}}}^* P {\mathbb {E}}_{{\textbf{p}}} [e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{1}\rangle } \vert \mu _1 (p_1)\vert ^2] \\&\qquad + P(P-1) ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}}z_g y_{{\tilde{g}}}^* \Big )\\&\quad = \frac{1}{P}{\mathbb {E}}_{{\textbf{p}}} \Big ( [{\mathcal {S}}^*z ({\mathcal {S}}^*y)^*] (p_1) \vert \mu _1 (p_1)\vert ^2\Big )\\&\qquad +\frac{P-1}{P} \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,y \rangle ^*\\&\quad = \frac{1}{P}\langle {\mathcal {S}}^*z({\mathcal {S}}^*y)^*, \vert \mu _1 \vert ^2\rangle _{L^2}+\frac{P-1}{P} \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,y \rangle ^* \end{aligned} \end{aligned}$$
(45)
where \({\mathcal {S}}^*z\) is defined in the hypotheses of the Lemma. For \(y= z\), we obtain
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \vert ^2 \\&\quad = \frac{1}{P}\langle \vert {\mathcal {S}}^*z\vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2}+\frac{P-1}{P} \vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2. \end{aligned} \end{aligned}$$
(46)
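Equation (46) lends itself to a direct numerical check, since \(\langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \) is an empirical mean of the function \(f = \mu _1 \cdot {\mathcal {S}}^*z\) over the \(P\) sampled points. The sketch below is illustrative only and assumes \(d = 1\), \({\mathcal {D}} = [0,1]\), uniform sampling, a hypothetical density `mu1`, and arbitrary values for `OMEGAS` and `Z`. It also evaluates \(P\) times the variance implied by (46) for several values of \(P\), which should be constant.

```python
import cmath
import random

OMEGAS = [1.0, 3.0, 7.5]   # arbitrary illustrative frequencies (assumption)
Z = [1.0, 0.5j, -0.25]     # arbitrary vector z in C^m (assumption)

def mu1(x):
    return 2.0 * x         # hypothetical density on [0, 1]

def adjoint(z, p):
    # [S^* z](p) = sum_g z_g exp(j * omega_g * p)
    return sum(zg * cmath.exp(1j * w * p) for zg, w in zip(z, OMEGAS))

def f(x):
    # <B_p mu1(p), z> = (1/P) sum_t f(p_t) with f = mu1 * S^* z
    return mu1(x) * adjoint(Z, x)

def midpoint_integral(g, n=20000):
    # Midpoint-rule approximation of int_0^1 g(x) dx
    h = 1.0 / n
    return sum(g((i + 0.5) * h) for i in range(n)) * h

# <|S^* z|^2, |mu1|^2>_{L^2} and |<S mu1, z>|^2, computed once by quadrature
KERNEL_TERM = midpoint_integral(lambda x: abs(f(x)) ** 2)
MEAN_TERM = abs(midpoint_integral(f)) ** 2

def predicted_second_moment(P):
    # Right-hand side of (46)
    return KERNEL_TERM / P + (P - 1) / P * MEAN_TERM

def empirical_second_moment(P, n_trials=40000, seed=0):
    # Monte Carlo estimate of E |<B_p mu1(p), z>|^2
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        x = sum(f(rng.random()) for _ in range(P)) / P
        total += abs(x) ** 2
    return total / n_trials

gap = abs(empirical_second_moment(4) - predicted_second_moment(4))
# P * variance should not depend on P, i.e. the variance decays as 1/P
scaled_variances = [p * (predicted_second_moment(p) - MEAN_TERM) for p in (2, 8, 32)]
print(gap, scaled_variances)
```

Subtracting \(\vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2\) from (46) leaves \(\frac{1}{P}\) times a quantity independent of \(P\), which is the behaviour `scaled_variances` is meant to display.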
We now calculate the following expectation:
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) \rangle \vert ^2 \\&\quad =\frac{1}{P^4} {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \sum _{g=1}^m\sum _{t=1}^P \sum _{r=1}^P e^{j\langle \omega _g,p_t-q_r\rangle }\mu _1 (p_t)\mu _2(q_r) \vert ^2\\&\quad =\frac{1}{P^4} \sum _{g,t,r} \sum _{{\tilde{g}},{\tilde{t}},{\tilde{r}}} {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}}\Big ( e^{j\langle \omega _g,p_t-q_r\rangle }e^{-j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}-q_{{\tilde{r}}}\rangle }\\&\quad \; \mu _1 (p_t)\mu _2(q_r) \mu _1 (p_{{\tilde{t}}})\mu _2(q_{{\tilde{r}}})\Big ). \end{aligned} \end{aligned}$$
(47)
As \({\textbf{p}}\) and \({\textbf{q}}\) are i.i.d.,
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) \rangle \vert ^2 \\&=\frac{1}{P^4} \sum _{g,t,r} \sum _{{\tilde{g}},{\tilde{t}},{\tilde{r}}} \Big ( {\mathbb {E}}_{{\textbf{p}}} e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle } \mu _1 (p_t) \mu _1 (p_{{\tilde{t}}})\Big ) \\&\quad \;{\mathbb {E}}_{{\textbf{q}}}\Big (e^{-j\langle \omega _g,q_r\rangle +j\langle \omega _{{\tilde{g}}},q_{{\tilde{r}}}\rangle } \mu _2(q_r) \mu _2(q_{{\tilde{r}}})\Big )\\&=\frac{1}{P^4} \sum _{g,{\tilde{g}}} \sum _{t,{\tilde{t}}} {\mathbb {E}}_{{\textbf{p}}}\left( e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle } \mu _1 (p_t) \mu _1 (p_{{\tilde{t}}}) \right) \\&\quad \;\sum _{r,{\tilde{r}}}{\mathbb {E}}_{{\textbf{q}}}\left( e^{-j\langle \omega _g,q_r\rangle +j\langle \omega _{{\tilde{g}}},q_{{\tilde{r}}}\rangle } \mu _2(q_r) \mu _2(q_{{\tilde{r}}})\right) \\&= \frac{1}{P^4} \sum _{g,{\tilde{g}}} A_{1,g,{\tilde{g}}} A_{2,g,{\tilde{g}}}^* \\ \end{aligned} \end{aligned}$$
(48)
where, decomposing the sum into diagonal and off-diagonal terms,
$$\begin{aligned} \begin{aligned} A_{i,g,{\tilde{g}}}&= \sum _{t,{\tilde{t}}} {\mathbb {E}}_{{\textbf{p}}} \left( e^{j\langle \omega _g,p_t\rangle -j\langle \omega _{{\tilde{g}}},p_{{\tilde{t}}}\rangle } \mu _i (p_t) \mu _i (p_{{\tilde{t}}})\right) \\&= P {\mathbb {E}}_{{\textbf{p}}} \left( e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } \vert \mu _i (p_t)\vert ^2 \right) \\&\quad \;+ P(P-1) ({\mathcal {S}}\mu _i)_g^*({\mathcal {S}}\mu _i)_{{\tilde{g}}}. \end{aligned} \end{aligned}$$
(49)
We obtain
$$\begin{aligned} \begin{aligned}&P^2{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) \rangle \vert ^2 \\&\quad = \sum _{g,{\tilde{g}}} {\mathbb {E}}_{{\textbf{p}}} (e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } \vert \mu _1 (p_t)\vert ^2 ) {\mathbb {E}}_{{\textbf{q}}} (e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \vert \mu _2(q_r)\vert ^2)\\&\qquad + (P-1) ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}} {\mathbb {E}}_{{\textbf{q}}} (e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \vert \mu _2(q_r)\vert ^2 )\\&\qquad + (P-1) ({\mathcal {S}}\mu _2)_g({\mathcal {S}}\mu _2)_{{\tilde{g}}}^* {\mathbb {E}}_{{\textbf{p}}} (e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } \vert \mu _1 (p_t)\vert ^2 )\\&\qquad +(P-1)^2 ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}} ({\mathcal {S}}\mu _2)_g({\mathcal {S}}\mu _2)_{{\tilde{g}}}^* \end{aligned} \end{aligned}$$
(50)
with
$$\begin{aligned} \begin{aligned}&\sum _{g,{\tilde{g}}} {\mathbb {E}}_{{\textbf{p}}} (e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } \vert \mu _1 (p_t)\vert ^2 ) {\mathbb {E}}_{{\textbf{q}}} (e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \vert \mu _2(q_r)\vert ^2)\\&\quad ={\mathbb {E}}_{{\textbf{p}}} {\mathbb {E}}_{{\textbf{q}}} \left( \sum _{g,{\tilde{g}}}e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \right) \vert \mu _1 (p_t)\vert ^2\vert \mu _2(q_r)\vert ^2. \end{aligned} \end{aligned}$$
(51)
We have inside the expectation,
$$\begin{aligned} \begin{aligned}&\left( \sum _{g,{\tilde{g}}}e^{j\langle \omega _{{\tilde{g}}},q_r-p_t\rangle } e^{-j\langle \omega _g,q_{r}-p_t\rangle } \right) \vert \mu _1 (p_t)\vert ^2\vert \mu _2(q_r)\vert ^2\\&\quad = \left( \sum _{{\tilde{g}}}e^{j\langle \omega _{{\tilde{g}}},q_r-p_t\rangle } \sum _{g}e^{-j\langle \omega _g,q_{r}-p_t\rangle } \right) \vert \mu _1 (p_t)\vert ^2\vert \mu _2(q_r)\vert ^2\\&\quad = \Big \vert \sum _{g}e^{j\langle \omega _g,q_{r}-p_t\rangle } \Big \vert ^2 \vert \mu _1 (p_t)\vert ^2\vert \mu _2(q_r)\vert ^2. \end{aligned} \end{aligned}$$
(52)
This gives, for the first term of the right-hand side of (50),
$$\begin{aligned} \begin{aligned}&\sum _{g,{\tilde{g}}} {\mathbb {E}}_{{\textbf{p}}} (e^{-j\langle \omega _{{\tilde{g}}}-\omega _g,p_{t}\rangle } \vert \mu _1 (p_t)\vert ^2 ) {\mathbb {E}}_{{\textbf{q}}} (e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \vert \mu _2(q_r)\vert ^2)\\&\quad = {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert [{\mathcal {S}}^* {\textbf{1}}] (q_r-p_t)\vert ^2 \vert \mu _1(p_t)\vert ^2\vert \mu _2(q_r) \vert ^2\\&\quad = \langle \vert \mu _1\vert ^2,\vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}}),\vert {\mathcal {S}}^* {\textbf{1}} \vert ^2} \end{aligned} \end{aligned}$$
(53)
where we define for a kernel h (a function from \({\mathcal {D}}\) to \({\mathbb {R}}\)), and two measures \(\nu _1, \nu _2\), \(\langle \nu _1, \nu _2 \rangle _{L^2({\mathcal {D}}), {h}} = \int _{x,y} \nu _1(x)\nu _2(y)h(x-y) dxdy\).
We calculate the second term (and similarly the third term) of the right-hand side of (50).
$$\begin{aligned} \begin{aligned}&\sum _{g,{\tilde{g}}} ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}} {\mathbb {E}}_{{\textbf{q}}} ( e^{j\langle \omega _{{\tilde{g}}}-\omega _g,q_{r}\rangle } \vert \mu _2(q_r)\vert ^2)\\&\quad = {\mathbb {E}}_{{\textbf{q}}} \sum _{g,{\tilde{g}}} e^{-j\langle \omega _g,q_{r}\rangle }({\mathcal {S}}\mu _1)_g^*e^{j\langle \omega _{{\tilde{g}}},q_{r}\rangle }({\mathcal {S}}\mu _1)_{{\tilde{g}}} \vert \mu _2(q_r)\vert ^2\\&\quad = {\mathbb {E}}_{{\textbf{q}}} \left( \vert [{\mathcal {S}}^*{\mathcal {S}}\mu _1](q_r) \vert ^2\vert \mu _2(q_r)\vert ^2\right) = \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _1 \vert ^2, \vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}})}. \end{aligned} \end{aligned}$$
(54)
The fourth term of the right-hand side of (50) yields
$$\begin{aligned} \begin{aligned}&\sum _{g,{\tilde{g}}} ({\mathcal {S}}\mu _1)_g^*({\mathcal {S}}\mu _1)_{{\tilde{g}}} ({\mathcal {S}}\mu _2)_g({\mathcal {S}}\mu _2)_{{\tilde{g}}}^* \\&\quad = \sum _g ({\mathcal {S}}\mu _1)_g^* ({\mathcal {S}}\mu _2)_g\langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2\rangle = \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2\rangle \vert ^2. \end{aligned} \end{aligned}$$
(55)
Going back to (48) and substituting expressions (53), (54) and (55) into (50), we obtain
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) \rangle \vert ^2\\&\quad = \frac{1}{P^2} \langle \vert \mu _1\vert ^2,\vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}}),\vert {\mathcal {S}}^* {\textbf{1}} \vert ^2}\\&\qquad + \frac{P-1}{P^2} \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _1 \vert ^2, \vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}})} \\&\qquad + \frac{P-1}{P^2} \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _2 \vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2({\mathcal {D}})} \\&\qquad + \frac{(P-1)^2}{P^2} \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2\rangle \vert ^2. \end{aligned} \end{aligned}$$
(56)
By developing expressions, we have
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \\&\qquad - {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2 \\&\quad = {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2- \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}} \mu _2 - z \rangle \vert ^2 \\&\quad ={\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}})\rangle \vert ^2 \\&\qquad - 2 {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} {\mathcal {R}}e\left( \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}})\rangle ^*\right) \\&\qquad +{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \vert ^2- \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}} \mu _2 - z \rangle \vert ^2 \\&\quad ={\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}})\rangle \vert ^2 \\&\qquad - 2 {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} {\mathcal {R}}e\left( \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \langle B_{\textbf{p}}\mu _1({\textbf{p}}), {\mathcal {S}}\mu _2\rangle ^*\right) \\&\qquad +{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle \vert ^2- \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}} \mu _2 - z \rangle \vert ^2. \end{aligned} \end{aligned}$$
(57)
Using Eq. (56), Eq. (45) with \(y = {\mathcal {S}}\mu _2\), Eq. (46), and the fact that \( {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} {\mathcal {R}}e\langle B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle = {\mathcal {R}}e\langle {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} B_{\textbf{p}}\mu _1({\textbf{p}}), z \rangle = {\mathcal {R}}e\langle {\mathcal {S}}\mu _1, z \rangle \) by Lemma 6.1, we have
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \\&\qquad - {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2 \\&\quad = \frac{1}{P^2} \langle \vert \mu _1\vert ^2,\vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}}),\vert {\mathcal {S}}^* {\textbf{1}} \vert ^2}\\&\qquad + \frac{P-1}{P^2} \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _1 \vert ^2, \vert \mu _2 \vert ^2\rangle _{L^2({\mathcal {D}})} \\&\qquad + \frac{P-1}{P^2} \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _2 \vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2({\mathcal {D}})} \\&\qquad + \frac{(P-1)^2}{P^2} \vert \langle {\mathcal {S}}\mu _1, {\mathcal {S}}\mu _2\rangle \vert ^2\\&\qquad -2\frac{1}{P} {\mathcal {R}}e\langle ({\mathcal {S}}^*z)({\mathcal {S}}^*{\mathcal {S}}\mu _2)^*, \vert \mu _1 \vert ^2\rangle _{L^2}\\&\qquad -2 \frac{P-1}{P} {\mathcal {R}}e\left( \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2 \rangle ^*\right) \\&\qquad + \frac{1}{P}\langle \vert {\mathcal {S}}^*z\vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2({\mathcal {D}})}\\&\qquad +\frac{P-1}{P} \vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2\\&\qquad - \vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2 \\&\qquad + 2 {\mathcal {R}}e\left( \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2 \rangle ^*\right) \\&\qquad - \vert \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2\rangle \vert ^2. \end{aligned} \end{aligned}$$
(58)
Regrouping terms yields
$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \vert \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \\&\qquad - {\mathbb {E}}_{{\textbf{p}},{\textbf{q}}} \langle B_{\textbf{p}}\mu _1({\textbf{p}}), B_{\textbf{q}}\mu _2({\textbf{q}}) -z \rangle \vert ^2 \\&\quad = \frac{1}{P^2} \langle \vert \mu _1\vert ^2,\vert \mu _2 \vert ^2\rangle _{L^2,\vert {\mathcal {S}}^* {\textbf{1}} \vert ^2}\\&\qquad + \frac{P-1}{P^2} \left( \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _1 \vert ^2, \vert \mu _2 \vert ^2\rangle _{L^2} + \langle \vert {\mathcal {S}}^*{\mathcal {S}}\mu _2 \vert ^2, \vert \mu _1 \vert ^2\rangle _{L^2} \right) \\&\qquad + \frac{1}{P} {\mathcal {R}}e\langle \vert {\mathcal {S}}^*z\vert ^2 -2{\mathcal {S}}^*z({\mathcal {S}}^*{\mathcal {S}}\mu _2)^*, \vert \mu _1 \vert ^2\rangle _{L^2}\\&\qquad + \frac{2}{P} {\mathcal {R}}e\left( \langle {\mathcal {S}}\mu _1,z \rangle \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2 \rangle ^*\right) \\&\qquad - \frac{1}{P}\vert \langle {\mathcal {S}}\mu _1,z \rangle \vert ^2 \\&\qquad - \frac{2P-1}{P^2} \vert \langle {\mathcal {S}}\mu _1,{\mathcal {S}}\mu _2\rangle \vert ^2. \end{aligned} \end{aligned}$$
(59)
\(\square \)
Since the \(P\)-dependent factors in \(C(\mu _1,\mu _2,z)\) are bounded, the variance converges to 0 at the typical rate \(1/P\) as \(P\) grows.