Appendix
Appendix A.1: Some lemmas
Before proving the main results, we introduce some lemmas that will be used several times in our proofs.
Lemma A.1
(Hoeffding inequality) Let \(X_{1}, \ldots , X_{n}\) be a sequence of independent random variables. Assume that \(P\left( X_{i} \in \left[ a_{i}, b_{i}\right] \right) =1\) for \(1 \le i \le n\), where \(a_{i}\) and \(b_{i}\) are constants. Let \({\bar{X}}=n^{-1} \sum _{i=1}^{n} X_{i}\). Then the following inequality holds:
$$\begin{aligned} P(|{\bar{X}}-E({\bar{X}})| \ge t) \le 2 \exp \left( -\frac{2 n^{2} t^{2}}{\sum _{i=1}^{n}\left( b_{i}-a_{i}\right) ^{2}}\right) . \end{aligned}$$
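For intuition, the following minimal Python/NumPy sketch (not part of the original argument) compares the empirical tail probability of a bounded sample mean with the Hoeffding bound above; the uniform distribution, sample size, threshold and seed are arbitrary illustration choices.
```python
import numpy as np

rng = np.random.default_rng(0)
n, t, reps = 200, 0.1, 20000       # arbitrary illustration values
a, b = 0.0, 1.0                    # each X_i is supported on [a, b]

# Empirical tail probability of |Xbar - E(Xbar)| >= t for Uniform(a, b) samples
X = rng.uniform(a, b, size=(reps, n))
deviation = np.abs(X.mean(axis=1) - (a + b) / 2)
empirical = (deviation >= t).mean()

# Hoeffding bound: 2 exp(-2 n^2 t^2 / sum_i (b_i - a_i)^2), here b_i - a_i = b - a
bound = 2 * np.exp(-2 * n**2 * t**2 / (n * (b - a) ** 2))

print(f"empirical tail = {empirical:.4f}, Hoeffding bound = {bound:.4f}")
```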
Lemma A.2
Suppose that a random variable U satisfies the sub-exponential condition \(E\exp (sU)<\infty \) for some constant \(s>0\), and let \(U_1,\ldots ,U_n\) be i.i.d. copies of U. Then, for any \(\varepsilon >0\) and any constant \(0<\alpha <1 / 2\), there exists a positive constant \(c_0\) such that
$$\begin{aligned} P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +c_0 n\exp \left( -s n^{\alpha }\right) \end{aligned}$$
(A.1)
Proof
For any \(M>0\),
$$\begin{aligned}&P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon \right) \\&\quad =P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) \\&\qquad +P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| >M\right) \\&\quad \le P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) +P\left( |U_i| \ge M \text{ for } \text{ some } i\right) \\&\quad \le P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) +n P\left( |U_i| \ge M\right) \\&\quad \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) + c_0 n\exp \left( -s n^{\alpha }\right) , \end{aligned}$$
where the last inequality holds because of the Hoeffding inequality and the fact that \(P\left( \left| U_i\right| \ge M\right) \le 2E\exp (sU) \exp (-sM),\) which follows from the Markov inequality. Setting \(M=n^{\alpha }\) with \(0<\alpha <1/2\) and \(c_0=E\exp (sU)\) proves (A.1). \(\square \)
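The Markov-inequality step above can be checked numerically. The sketch below is an illustration only: it uses an Exponential(1) variable, for which \(E\exp (sU)=1/(1-s)\) when \(s<1\); the values of s, M and the seed are arbitrary choices.
```python
import numpy as np

rng = np.random.default_rng(1)
s, M, reps = 0.5, 8.0, 200_000      # arbitrary illustration values

# U ~ Exponential(1) is sub-exponential: E exp(sU) = 1/(1 - s) for s < 1
U = rng.exponential(1.0, size=reps)
empirical_tail = (np.abs(U) >= M).mean()

# Markov/Chernoff step: for this nonnegative U, P(|U| >= M) <= E exp(sU) * exp(-sM)
chernoff_bound = (1.0 / (1.0 - s)) * np.exp(-s * M)

print(f"P(|U| >= M) ~= {empirical_tail:.2e}, bound = {chernoff_bound:.2e}")
```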
Lemma A.3
By Condition (C1), we have that
$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(X_j^2,Y_k^2)-\text{ Cov }(X_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) , \end{aligned}$$
and
$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(Y_j^2,Y_k^2)-\text{ Cov }(Y_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
Proof
Setting \(U=X_j^2Y_k^2\) for \(j\in \{1,\ldots ,p\}\) and \(k\in \{1,\ldots ,q\}\) and arguing as in Lemma A.2, we have
$$\begin{aligned} P\left( \left| {\widehat{E}}(X_j^2Y_k^2)-E(X_j^2Y_k^2)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
By simple algebra, we have
$$\begin{aligned} {\widehat{E}}(X_j^2){\widehat{E}}(Y_k^2)-E(X_j^2)E(Y_k^2)&={\widehat{E}}(X_j^2)\{{\widehat{E}}(Y_k^2)-E(Y_k^2)\}\\&\quad +\{{\widehat{E}}(X_j^2)-E(X_j^2)\}E(Y_k^2). \end{aligned}$$
As a result,
$$\begin{aligned}&P\left( \left| {\widehat{E}}(X_j^2){\widehat{E}}(Y_k^2)-E(X_j^2)E(Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le P\left( \left| {\widehat{E}}(X_j^2)\{{\widehat{E}}(Y_k^2)-E(Y_k^2)\}\right| \ge \varepsilon /2\right) \\&\qquad +P\left( \left| \{{\widehat{E}}(X_j^2)-E(X_j^2)\}E(Y_k^2)\right| \ge \varepsilon /2\right) \\&\quad \le 2 \exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 8\right) +2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 8\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) , \end{aligned}$$
where \(c_1\) and \(c_2\) are constants depending on \(E(X_j^2)\) and \(E(Y_k^2)\), respectively. Consequently, it can be proved that
$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(X_j^2,Y_k^2)-\text{ Cov }(X_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
\(\square \)
Lemma A.4
Let \(\varvec{D}=\widehat{\varvec{\Sigma }}_{\widetilde{y}}-\varvec{\Sigma }_{\widetilde{y}}\). Then
$$\begin{aligned} \left| \lambda _{\min }(\widehat{\varvec{\Sigma }}_{\widetilde{y}})-\lambda _{\min }({\varvec{\Sigma }}_{\widetilde{y}})\right| \le q\Vert \varvec{D}\Vert _{\infty }. \end{aligned}$$
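This inequality can be checked numerically. The sketch below is an illustration only, interpreting \(\Vert \varvec{D}\Vert _{\infty }\) as the entrywise maximum absolute value; the dimension, the perturbation scale and the seed are arbitrary choices.
```python
import numpy as np

rng = np.random.default_rng(2)
q = 6                                        # arbitrary dimension

B = rng.standard_normal((q, q))
Sigma = B @ B.T + q * np.eye(q)              # a positive definite "population" matrix
E = 0.05 * rng.standard_normal((q, q))
D = (E + E.T) / 2                            # D = Sigma_hat - Sigma, symmetric perturbation
Sigma_hat = Sigma + D

lhs = abs(np.linalg.eigvalsh(Sigma_hat)[0] - np.linalg.eigvalsh(Sigma)[0])
rhs = q * np.max(np.abs(D))                  # q * ||D||_inf with the entrywise max norm

print(f"|difference of minimal eigenvalues| = {lhs:.4f}, q*||D||_inf = {rhs:.4f}")
```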
The proof can be found in Lemma 4 of Li et al. (2017). The following lemma is also from Li et al. (2017); it can be proved using the Hoeffding inequality, the Markov inequality and some matrix theory.
Lemma A.5
For any \(\varepsilon >0\),
$$\begin{aligned} P\left( \left| \lambda _{\min }\left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) -\lambda _{\min }\left( {\varvec{\Sigma }}_{\varvec{y}}\right) \right| \ge \varepsilon \right) \le q_{n}^{2}\left( 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2 q_{n}^{2}\right) +A \exp \left( -sn^{\alpha }\right) \right) \end{aligned}$$
In addition, there exist some positive constants \(c_{2}\) and \(c_{3}\) such that
$$\begin{aligned} P\left\{ \left| \Vert \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}\Vert -\Vert {\varvec{\Sigma }}_{\varvec{y}}^{-1}\Vert \right| \ge c_{3}\Vert \varvec{\Sigma }_{\varvec{y}}^{-1}\Vert \right\} \le q_{n}^{2}\left( 2 \exp \left( -c_{2} n^{1-2 \alpha } q_{n}^{-4}\right) + A \exp \left( -sn^{\alpha }\right) \right) , \end{aligned}$$
where \(\Vert \varvec{B}\Vert =\lambda _{\max }(\varvec{B}^\top \varvec{B})\) denotes the maximal eigenvalue of \(\varvec{B}^\top \varvec{B}\) for any matrix \(\varvec{B}\).
Appendix A.2: Proof of Theorem 3.1
To prove the consistency of \(\widehat{\varvec{\Phi }}_j\), we first decompose \(\varvec{\Phi }_j\). Note that \(\varvec{\Sigma }_{\widetilde{\varvec{x}}_j}\) is a \(2\times 2\) matrix, denoted as
$$\begin{aligned} \varvec{\Sigma }_{\widetilde{\varvec{x}}_j}=\left( \begin{array}{cc}\sigma ^2_{j,11}&{}\quad \sigma ^2_{j,12}\\ \sigma ^2_{j,21}&{}\quad \sigma ^2_{j,22}\end{array}\right) , \end{aligned}$$
where \(\sigma ^2_{j,11}=\text{ Cov }(X_j^2,X_j^2)\), \(\sigma ^2_{j,12}=\text{ Cov }(X_j^2,X_j)\) and \(\sigma ^2_{j,22}=\text{ Cov }(X_j,X_j)\). Let \(\varvec{s}_1\) and \(\varvec{s}_2\) denote the two columns of \(\varvec{\Sigma }_{\widetilde{\varvec{y}}\widetilde{\varvec{x}}_j}\), so that \(\varvec{\Sigma }_{\widetilde{\varvec{x}}_j\widetilde{\varvec{y}}}=(\varvec{s}_1,\varvec{s}_2)^\top \); then
$$\begin{aligned} {\varvec{\Phi }}_j= & {} \left( \varvec{\Sigma }_{\widetilde{\varvec{x}}_j}\right) ^{-1} \varvec{\Sigma }_{\widetilde{\varvec{x}}_j\widetilde{\varvec{y}}} \left( \varvec{\Sigma }_{\widetilde{\varvec{y}}}\right) ^{-1} \varvec{\Sigma }_{\widetilde{\varvec{y}}\widetilde{\varvec{x}}_j}\\= & {} \left( \begin{array}{cc} \sigma ^2_{j,11}&{}\quad \sigma ^2_{j,12}\\ \sigma ^2_{j,21}&{}\quad \sigma ^2_{j,22}\end{array}\right) ^{-1} \left( \begin{array}{c}\varvec{s}_1^\top \\ \varvec{s}_2^\top \end{array}\right) \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\left( \varvec{s}_1, \varvec{s}_2\right) \\= & {} \left( \begin{array}{cc}\gamma _{j,11}&{}\quad \gamma _{j,12}\\ \gamma _{j,21}&{}\quad \gamma _{j,22}\end{array}\right) \left( \begin{array}{cc} \varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1 &{}\quad \varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2 \\ \varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1 &{}\quad \varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2 \end{array}\right) \\= & {} \left( \begin{array}{cc} \underbrace{\gamma _{j,11}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1+\gamma _{j,12}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1}_{\phi _j^{11}}&{}\quad \underbrace{\gamma _{j,11}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2+\gamma _{j,12}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2}_{\phi _j^{12}}\\ \underbrace{\gamma _{j,21}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1+\gamma _{j,22}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1}_{\phi _j^{21}}&{} \quad \underbrace{\gamma _{j,21}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2+\gamma _{j,22}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2}_{\phi _j^{22}} \end{array}\right) \end{aligned}$$
Similarly, we can decompose the sample analogue \(\widehat{\varvec{\Phi }}_j\) of \(\varvec{\Phi }_j\) as
$$\begin{aligned} \widehat{\varvec{\Phi }}_j= & {} \left( \begin{array}{cc} {\hat{\phi }}_j^{11}&{}\quad {\hat{\phi }}_j^{12}\\ {\hat{\phi }}_j^{21}&{}\quad {\hat{\phi }}_j^{22} \end{array}\right) \\= & {} \left( \begin{array}{cc} {\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1&{}\quad {\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2\\ {\hat{\gamma }}_{j,21}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,22}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1&{}\quad {\hat{\gamma }}_{j,21}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2+{\hat{\gamma }}_{j,22}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2 \end{array}\right) . \end{aligned}$$
Since
$$\begin{aligned} {\hat{\omega }}_j=\frac{1}{2}\left( {\hat{\phi }}_j^{11}+{\hat{\phi }}_j^{22} +\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2 +4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}\right) , \end{aligned}$$
where \({\hat{\phi }}_j^{11}\) is the sample version of \(\phi _j^{11}\), and \({\hat{\phi }}_j^{12}\), \({\hat{\phi }}_j^{21}\) and \({\hat{\phi }}_j^{22}\) are defined similarly, it suffices to establish the consistency of each \({\hat{\phi }}_j^{kl}\).
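The expression for \({\hat{\omega }}_j\) is the larger root of the characteristic polynomial of the \(2\times 2\) matrix \(\widehat{\varvec{\Phi }}_j\). The short Python sketch below, with arbitrary illustrative entries, checks the closed form against a direct eigenvalue computation.
```python
import numpy as np

# Arbitrary illustrative entries of a 2x2 matrix Phi_hat_j
phi11, phi12, phi21, phi22 = 0.9, 0.3, 0.2, 0.5

# Closed-form expression used in the text for omega_hat_j
omega_closed = 0.5 * (phi11 + phi22 + np.sqrt((phi11 - phi22) ** 2 + 4 * phi12 * phi21))

# The same quantity as the larger eigenvalue of Phi_hat_j
Phi_hat = np.array([[phi11, phi12], [phi21, phi22]])
omega_eig = np.max(np.linalg.eigvals(Phi_hat).real)

print(f"closed form = {omega_closed:.6f}, largest eigenvalue = {omega_eig:.6f}")
```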
We first prove the consistency of \({\hat{\phi }}_j^{11}={\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1\). It is sufficient to bound the first term in \({\hat{\phi }}_j^{11}\) because the other term shares the same asymptotic behavior. The proof is similar in spirit to that of Li et al. (2017), so we only give its outline. Denote by \(\Vert \varvec{s}\Vert \) the Euclidean norm of a vector \(\varvec{s}\). For notational simplicity, let \(\varvec{A}=\varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\) and \(\widehat{\varvec{A}}=\widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\). By some simple algebra, we have
$$\begin{aligned}&\hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2-\varvec{s}_1^\top \varvec{A}\varvec{s}_2\\&\quad = \underbrace{(\hat{\varvec{s}}_1-{\varvec{s}}_1)^\top \widehat{\varvec{A}}(\hat{\varvec{s}}_2-{\varvec{s}}_2)}_{I_{11}} +\underbrace{(\hat{\varvec{s}}_1-{\varvec{s}}_1)^\top \widehat{\varvec{A}}{\varvec{s}}_2}_{I_{12}}+ \underbrace{{\varvec{s}}_1^\top \widehat{\varvec{A}}(\hat{\varvec{s}}_2-{\varvec{s}}_2)}_{I_{13}}+ \underbrace{\varvec{s}_1^\top (\widehat{\varvec{A}}-\varvec{A})\varvec{s}_2}_{I_{14}} \end{aligned}$$
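This four-term decomposition is an exact algebraic identity; the following sketch verifies it numerically on random inputs (the dimension, perturbation scales and seed are arbitrary illustration choices).
```python
import numpy as np

rng = np.random.default_rng(3)
q = 5                                          # arbitrary dimension

s1, s2 = rng.standard_normal(q), rng.standard_normal(q)
s1_hat = s1 + 0.1 * rng.standard_normal(q)
s2_hat = s2 + 0.1 * rng.standard_normal(q)
A = np.linalg.inv(np.eye(q) + 0.2 * rng.standard_normal((q, q)))
A_hat = A + 0.05 * rng.standard_normal((q, q))

lhs = s1_hat @ A_hat @ s2_hat - s1 @ A @ s2
I11 = (s1_hat - s1) @ A_hat @ (s2_hat - s2)
I12 = (s1_hat - s1) @ A_hat @ s2
I13 = s1 @ A_hat @ (s2_hat - s2)
I14 = s1 @ (A_hat - A) @ s2

print(np.isclose(lhs, I11 + I12 + I13 + I14))  # True: the identity is exact
```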
We then bound \(I_{1i}\) for \(i=1,\ldots ,4\) one by one. For the first term \(I_{11}\), note that
$$\begin{aligned} |I_{11}|\le \Vert \widehat{\varvec{A}}\Vert \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert \Vert \hat{\varvec{s}}_2-\varvec{s}_2\Vert \end{aligned}$$
By Lemma A.3, we have
$$\begin{aligned}&P\left( \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert ^2 \ge q\varepsilon ^2 \right) \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \nonumber \\&\quad + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
(A.2)
Similarly, the same probability bound holds for \(\hat{\varvec{s}}_2-\varvec{s}_2\). We next bound \(\Vert \widehat{\varvec{A}}\Vert \): for some constant \(c_3\), we have
$$\begin{aligned}&P \left\{ \left| \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \right| \ge c_{3}\Vert \varvec{A}\Vert \right\} \\&\quad = P\left\{ \left| \lambda _{\max }(\widehat{\varvec{A}})-\lambda _{\max }(\varvec{A})\right| \ge c_3 \lambda _{\max }(\varvec{A})\right\} \\&\quad \le P\left\{ \left| \lambda ^{-1}_{\min }(\widehat{\varvec{A}}^{-1})-\lambda ^{-1}_{\min }({\varvec{A}}^{-1})\right| \ge c_{3}\lambda ^{-1}_{\min }(\varvec{A}^{-1})\right\} \\&\quad \le P\left\{ \left| \lambda _{\min }(\widehat{\varvec{A}}^{-1})-\lambda _{\min }({\varvec{A}}^{-1})\right| \ge c_3\lambda _{\min }(\varvec{A}^{-1})\right\} \\&\quad \le P\left\{ \left| \lambda _{\min }(\widehat{\varvec{A}}^{-1})-\lambda _{\min }({\varvec{A}}^{-1})\right| \ge c_3C_1q^{-1}\right\} \\&\quad \le P\left( q\Vert \varvec{D}\Vert _\infty \ge c_3C_1q^{-1}\right) \\&\quad \le 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \right. \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} , \end{aligned}$$
where the second-to-last inequality holds because of Lemma A.4.
By Condition (C2), it follows that
$$\begin{aligned}&P\left\{ \Vert \widehat{\varvec{A}}\Vert \ge \left( c_{3}+1\right) C_2 q^{-1}\right\} \le P\left\{ \Vert \widehat{\varvec{A}}\Vert \ge \left( c_{3}+1\right) \Vert \varvec{A}\Vert \right\} \nonumber \\&\quad =P\left\{ \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \ge c_{3}\Vert \varvec{A}\Vert \right\} \le P\left\{ \left| \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \right| \ge c_{3}\Vert \varvec{A}\Vert \right\} \nonumber \\&\quad \le 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \right. \nonumber \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} . \end{aligned}$$
(A.3)
Consequently,
$$\begin{aligned}&P\left\{ \left| I_{11}\right| \ge \left( c_{3}+1\right) C_2\varepsilon ^{2}\right\} \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) + 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) \right. \\&\qquad +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} . \end{aligned}$$
Or equivalently,
$$\begin{aligned}&P\left\{ \left| I_{11}\right| \ge \varepsilon \right\} \\&\quad \le 2\exp \left( -B_1 n^{1-2 \alpha } \varepsilon ^{2}\right) + 2q^2 \exp \left( -B_2 n^{1-2 \alpha }q^{-4}\right) +c_0n\exp \left( -s n^{\alpha /2}\right) , \end{aligned}$$
where \(B_1\) and \(B_2\) are two positive constants depending on \(c_1,c_2,c_3,C_1\) and \(C_2\).
We next establish the convergence rate of the second term \(I_{12}\). We have
$$\begin{aligned} |I_{12}|\le \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert \Vert \widehat{\varvec{A}}\varvec{s}_2\Vert . \end{aligned}$$
For \(\Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \), by the triangle inequality, it follows that
$$\begin{aligned} \Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \le \Vert (\widehat{\varvec{A}}-\varvec{A})\varvec{s}_2 \Vert + \Vert \varvec{A}\varvec{s}_2\Vert \le \Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert + \Vert \varvec{A}\varvec{s}_2\Vert . \end{aligned}$$
For \(\Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert \), we have
$$\begin{aligned} \Vert \widehat{\varvec{A}}-\varvec{A}\Vert =\Vert \widehat{\varvec{A}}(\varvec{A}^{-1}-\widehat{\varvec{A}}^{-1})\varvec{A}\Vert \le \Vert \widehat{\varvec{A}}\Vert \Vert \varvec{A}\Vert \Vert \varvec{A}^{-1}-\widehat{\varvec{A}}^{-1}\Vert \le q^2C_1^{-1}\Vert \widehat{\varvec{A}}\Vert \Vert \varvec{D}\Vert , \end{aligned}$$
where the last inequality holds because \(\Vert \varvec{A}\Vert =\lambda _{\min }^{-1}(\varvec{A}^{-1})\le (C_1/q)^{-1}\) and \(\Vert \varvec{A}^{-1}-\widehat{\varvec{A}}^{-1}\Vert \le q\Vert \varvec{D}\Vert \). Since \(\Vert \varvec{D}\Vert _\infty \) converges to zero in probability and \(\Vert \widehat{\varvec{A}}\Vert \) is bounded with high probability by (A.3), \(\Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert \) is negligible as \(n\rightarrow \infty \). Thus \(\Vert \varvec{A}\varvec{s}_2\Vert \le \Vert \varvec{A}\Vert \Vert \varvec{s}_2\Vert =\Vert \varvec{s}_2\Vert /\lambda _{\min }(\varvec{A}^{-1})\le \Vert \varvec{s}_2\Vert q/C_1\) controls the convergence rate of \(\Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \). It follows that \(\Vert \varvec{A}\varvec{s}_2\Vert ^2 \le \Vert \varvec{s}_2\Vert ^2q^2/C^2_1\le q^3C_1^{-2}\Vert \varvec{s}_2\Vert ^2_\infty \). Note that \(\Vert \varvec{s}_2\Vert ^2_\infty \) is finite under Condition (C1) in Section 3 as it is an expectation. Finally, combining the above results with (A.2), we obtain
$$\begin{aligned} P\left( |I_{12}| \ge \varepsilon \right)\le & {} P\left( \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert ^2 \ge q^{-2}\varepsilon ^2C_1^2/\Vert \varvec{s}_2\Vert ^2_\infty \right) \nonumber \\\le & {} 2\exp \left( -C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \nonumber \\&+ 2 \exp \left( -c_2C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 32\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) \nonumber \\= & {} O\{\exp \left( -O\{q^{-4}n^{1-2 \alpha } \varepsilon ^{2}\}\right) +n\exp \left( -s n^{\alpha /2} \right) \} \end{aligned}$$
(A.4)
The terms \(I_{13}\) and \(I_{14}\) have the same asymptotic behavior as \(I_{12}\); we thus omit the details to save space. Consequently,
$$\begin{aligned}&P\left\{ \left| \hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2-\varvec{s}_1^\top \varvec{A}\varvec{s}_2\right| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha } \varepsilon ^{2}\}\right) . \end{aligned}$$
Note that when aggregating the bounds on \(P\left( |I_{11}| \ge \varepsilon \right) \) and \(P\left( |I_{12}| \ge \varepsilon \right) \), we drop the negligible terms that do not affect the convergence rate of \(\hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2\).
Next, we prove the consistency of \({\hat{\gamma }}_{j,12}\). For simplicity, denote \({{\hat{g}}}_j={\hat{\sigma }}^2_{j,11}{\hat{\sigma }}^2_{j,22}-{\hat{\sigma }}_{j,12}^2{\hat{\sigma }}_{j,21}^2\) and \(g_j=\sigma ^2_{j,11}\sigma ^2_{j,22}-\sigma _{j,12}^2\sigma _{j,21}^2\). Then it can be derived that
$$\begin{aligned}&P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon ) =P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon \right) \\&\quad \le P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon , {{\hat{g}}}_j\le N\right) \\&\qquad +P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon , {{\hat{g}}}_j> N\right) \\&\quad \le P\left( \left| {g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}\right| \ge {g_j}\varepsilon /N\right) +P({{\hat{g}}}_j>N)\\&\quad \le P\left( \left| g_j({\hat{\sigma }}_{j,12}^2-\sigma _{j,12}^2)\right| +\left| \sigma _{j,12}^2({{\hat{g}}}_j-g_j)\right| \ge {g_j}\varepsilon /N\right) +P({{\hat{g}}}_j>N). \end{aligned}$$
By a proof technique similar to that of Lemma A.3, we can obtain a similar upper bound for the probability \(P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon )\); namely, there exist some constants \(b_0\), \(b_1\) and \(b_2\) such that
$$\begin{aligned}&P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon )\\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -b_1n^{1-2 \alpha } \varepsilon ^{2}\right) + 2 \exp \left( -b_2n^{1-2 \alpha } \varepsilon ^{2}\right) \\&\qquad +2b_0 n\exp \left( -s n^{\alpha /2}\right) \\&\quad =\exp \left( -O\{n^{1-2 \alpha } \varepsilon ^{2}\}\right) +2b_0 n\exp \left( -s n^{\alpha /2}\right) \end{aligned}$$
For simplicity, let \({\hat{\delta }}=\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1\) and \(\delta ={\varvec{s}}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}{\varvec{s}}_1\), then
$$\begin{aligned} {\hat{\gamma }}_{j,11}{\hat{\delta }}-\gamma _{j,11}\delta =\underbrace{{\hat{\gamma }}_{j,11}({\hat{\delta }}-\delta )}_{J_1}+\underbrace{({\hat{\gamma }}_{j,11}-\gamma _{j,11})\delta }_{J_2}. \end{aligned}$$
The upper bound of \(J_2\) has the same order as that of \(|{\hat{\gamma }}_{j,11}-\gamma _{j,11}|\) because \(\delta \) is bounded. The bound of \(J_1\) is also easily obtained because
$$\begin{aligned} P(|{\hat{\gamma }}_{j,11}({\hat{\delta }}-\delta )|\ge \varepsilon )\le P(|{\hat{\delta }}-\delta |\ge \varepsilon /H)+P(|{\hat{\gamma }}_{j,11}|\ge H), \end{aligned}$$
where the second term is negligible if we set \(H=n^\xi \). Consequently,
$$\begin{aligned}&P\left\{ \left| {\hat{\phi }}_j^{11}-\phi _j^{11}\right| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha -2\xi } \varepsilon ^{2}\}\right) . \end{aligned}$$
By similar arguments, the asymptotic behavior of \({\hat{\phi }}_j^{22}\), \({\hat{\phi }}_j^{12}\) and \({\hat{\phi }}_j^{21}\) can also be obtained.
Finally, we prove the consistency of \({\hat{\omega }}_j\). Note that
$$\begin{aligned}&\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}-\sqrt{(\phi _j^{11}-\phi _j^{22})^2+4\phi _j^{12}\phi _j^{21}}\\&\quad =\frac{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}-(\phi _j^{11}-\phi _j^{22})^2-4\phi _j^{12}\phi _j^{21}}{\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}+\sqrt{(\phi _j^{11}-\phi _j^{22})^2+4\phi _j^{12}\phi _j^{21}}}. \end{aligned}$$
Thus it is sufficient to prove the consistency of the numerator, as the denominator is bounded in probability. It is not difficult to show that the numerator shares the same convergence rate as \({\hat{\phi }}_j^{11}\); we omit the details to save space. In summary, it can be shown that \({\hat{\omega }}_j\) has the same convergence rate as \({\hat{\phi }}_j^{11}\). Consequently, we have
$$\begin{aligned}&P\left\{ |{\hat{\omega }}_j-\omega _j| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha -2\xi } \varepsilon ^{2}\}\right) \\&\qquad +\exp \left( O\{-n^{1-2 \alpha } \varepsilon ^{2}\}\right) +2b_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
Taking \(\varepsilon =c_\omega n^{-\tau }\) and using the condition \(q=O(n^\kappa )\), it follows that
$$\begin{aligned}&P\left\{ \max _{1\le j\le p}\left| {\hat{\omega }}_j-\omega _j\right| \ge c_\omega n^{-\eta } \right\} \le \sum _{j=1}^p P\left\{ \left| {\hat{\omega }}_j-\omega _j\right| \ge c_\omega n^{-\eta } \right\} \\&\quad \le p\left( O\{n\exp \left( -s n^{\alpha /2}\right) \} + 2n^{2\kappa } \exp \left( -O\{n^{1-2 \alpha -4\kappa }\}\right) \right. \\&\qquad \left. + \exp \left( -O\{c_\omega ^2 n^{1-2\alpha -2\tau -2\xi -4\kappa }\}\right) \right) \end{aligned}$$
Now, we prove the second part of the theorem. If \({\mathcal {A}} \not \subset \widehat{{\mathcal {A}}}\), then there must exist some \(k\in {\mathcal {A}}\) such that \({\hat{\omega }}_k<c_\omega n^{-\eta }\). It follows from Condition (C4) that \(|{\hat{\omega }}_k-\omega _k|>(c_\omega '-c_\omega )n^{-\eta }\) for some \(k\in {\mathcal {A}}\), indicating that \({\mathcal {E}}_n=\left\{ {\mathcal {A}}\not \subset \widehat{{\mathcal {A}}}\right\} \subset \{|{\hat{\omega }}_k-\omega _k|>(c_\omega '-c_\omega )n^{-\eta } \text{ for some } k\in {\mathcal {A}}\}\). Consequently,
$$\begin{aligned} \begin{aligned}&P({\mathcal {A}} \subset \widehat{{\mathcal {A}}}) =1-P({\mathcal {E}}_n)\\&\quad \ge 1-P\left( \max _{j \in {\mathcal {A}}}\left| \omega _{j}-\hat{\omega }_{j}\right| \ge (c_\omega '-c_\omega ) n^{-\eta }\right) \ge 1-s_{1n}\max _{j \in {\mathcal {A}}}P\left( \left| \omega _{j}-\hat{\omega }_{j}\right| \ge b_\omega n^{-\eta }\right) \\&\quad \ge 1-s_{1n}\left( O\{n\exp \left( -s n^{\alpha /2}\right) \} + 2n^{2\kappa } \exp \left( -O\{n^{1-2 \alpha -4\kappa }\}\right) \right. \\&\qquad \left. + \exp \left( -O\{b_\omega ^2 n^{1-2\alpha -2\tau -2\xi -4\kappa }\}\right) \right) , \end{aligned} \end{aligned}$$
where \(b_\omega =c_\omega '-c_\omega \). \(\square \)
Proof of Corollary 3.1
Define \( \delta =\min _{j \in {\mathcal {A}}} \omega _{j}-\max _{j \in {\mathcal {A}}^{c}} \omega _{j} \). Then, under Condition (C5), we have
$$\begin{aligned}&P\left( \min _{j \in {\mathcal {A}}} \hat{\omega }_{j} \le \max _{j \in {\mathcal {A}}^{c}} \hat{\omega }_{j}\right) \\&\quad = P\left( \min _{j \in {\mathcal {A}}} \hat{\omega }_{j}-\min _{j \in {\mathcal {A}}} \omega _{j}+\delta \le \max _{j \in {\mathcal {A}}^{c}} \hat{\omega }_{j}-\max _{j \in {\mathcal {A}}^{c}} \omega _{j}\right) \\&\quad \le P\left( \max _{j \in {\mathcal {A}}^{c}}\left| \hat{\omega }_{j}-\omega _{j}\right| +\max _{j \in {\mathcal {A}}}\left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta \right) \\&\quad \le P\left( \max _{1 \le j \le p}\left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta / 2\right) \le \sum _{j=1}^{p} P\left( \left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta / 2\right) . \end{aligned}$$
The last term goes to 0 as \(n \rightarrow \infty \) when p satisfies a suitable growth condition. \(\square \)
Proof of Theorem 3.2
We only prove the consistency of \({\hat{\rho }}_j\), as the consistency of \({\hat{\pi }}_{kl}\) can be proved similarly.
By setting \(U_i=X_{ij}Y_{ik}\) and \(U_i=Y_{ik}Y_{il}\), respectively, it is easily proved by Lemma A.2 that
$$\begin{aligned} P\left( \left| \frac{1}{n}\sum _{i=1}^n X_{ij}Y_{ik} -E (X_jY_k)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +A_1 \exp \left( -s n^{\alpha }\right) \end{aligned}$$
Similarly, we can bound \(P\left( \left| \frac{1}{n}\sum _{i=1}^n Y_{ij} -E (Y_j)\right| \ge \varepsilon \right) \) and \(P\left( \left| \frac{1}{n}\sum _{i=1}^n X_{ij}-E (X_j)\right| \ge \varepsilon \right) \). Thus, it can be proved that there exist some constants A and s such that
$$\begin{aligned} P\left( \left| \widehat{\varvec{\Sigma }}_{X_{j}Y_{k}} -{\varvec{\Sigma }}_{X_{j}Y_{k}}\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +A\exp \left( -s n^{\alpha }\right) . \end{aligned}$$
Note that
$$\begin{aligned} {\hat{\rho }}_j={\hat{\sigma }}_j^{-2}\widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1} \widehat{\varvec{\Sigma }}_{\varvec{y}X_j} \end{aligned}$$
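For concreteness, a minimal sketch of how \({\hat{\rho }}_j\) could be computed from data under this notation is given below; the toy data-generating process, the divide-by-n covariance convention and all numerical values are assumptions made only for illustration.
```python
import numpy as np

rng = np.random.default_rng(4)
n, q = 300, 4                                   # illustrative sizes

# Toy data: a scalar predictor X_j and a q-dimensional response y
Xj = rng.standard_normal(n)
Y = 0.5 * Xj[:, None] + rng.standard_normal((n, q))

Xc, Yc = Xj - Xj.mean(), Y - Y.mean(axis=0)
sigma_j2 = Xc @ Xc / n                          # hat{sigma}_j^2 (divide-by-n convention assumed)
Sigma_Xy = Yc.T @ Xc / n                        # hat{Sigma}_{X_j y}, a q-vector
Sigma_y = Yc.T @ Yc / n                         # hat{Sigma}_y, a q x q matrix

# rho_hat_j = sigma_j^{-2} * Sigma_{X_j y} Sigma_y^{-1} Sigma_{y X_j}
rho_hat_j = (Sigma_Xy @ np.linalg.solve(Sigma_y, Sigma_Xy)) / sigma_j2
print(f"rho_hat_j = {rho_hat_j:.4f}")
```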
Let \(\hat{\rho }^0_j=\widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1} \widehat{\varvec{\Sigma }}_{\varvec{y}X_j}\) and \(\rho _j^0={\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( {\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1}{\varvec{\Sigma }}_{\varvec{y}X_j}.\) Then by some algebra, we have
$$\begin{aligned} {\hat{\rho }}_j^0-\rho _j^0=I_1+I_2+I_3, \end{aligned}$$
where
$$\begin{aligned}&I_1=\left( \widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}}-{\varvec{\Sigma }}_{X_j{\varvec{y}}}\right) ^\top \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}\left( \widehat{\varvec{\Sigma }}_{{\varvec{y}}X_j}-{\varvec{\Sigma }}_{{\varvec{y}}X_j}\right) \\&I_2=2\left( \widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}}-{\varvec{\Sigma }}_{X_j{\varvec{y}}}\right) ^\top \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}{\varvec{\Sigma }}_{{\varvec{y}}X_j}\\&I_3={\varvec{\Sigma }}_{X_j{\varvec{y}}}\left( \widehat{\varvec{\Sigma }}_{{\varvec{y}}}^{-1}-{\varvec{\Sigma }}_{{\varvec{y}}}^{-1}\right) {\varvec{\Sigma }}_{{\varvec{y}}X_j}. \end{aligned}$$
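The decomposition \({\hat{\rho }}_j^0-\rho _j^0=I_1+I_2+I_3\), with the difference of the inverse matrices appearing in \(I_3\), is exact; the sketch below verifies it numerically on random quantities (all values and the seed are arbitrary illustration choices).
```python
import numpy as np

rng = np.random.default_rng(5)
q = 4                                           # arbitrary dimension

a = rng.standard_normal(q)                      # plays the role of Sigma_{y X_j}
a_hat = a + 0.1 * rng.standard_normal(q)        # plays the role of hat{Sigma}_{y X_j}
M = rng.standard_normal((q, q))
Sigma_y = M @ M.T + q * np.eye(q)
E = rng.standard_normal((q, q))
Sigma_y_hat = Sigma_y + 0.1 * (E + E.T) / 2     # symmetric perturbation of Sigma_y

B = np.linalg.inv(Sigma_y)                      # Sigma_y^{-1}
B_hat = np.linalg.inv(Sigma_y_hat)              # hat{Sigma}_y^{-1}

lhs = a_hat @ B_hat @ a_hat - a @ B @ a         # rho_hat_j^0 - rho_j^0
I1 = (a_hat - a) @ B_hat @ (a_hat - a)
I2 = 2 * (a_hat - a) @ B_hat @ a
I3 = a @ (B_hat - B) @ a

print(np.isclose(lhs, I1 + I2 + I3))            # True: the decomposition is exact
```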
The rest of the proof is the same as that in Li et al. (2017), so we omit the details. Eventually, we have
$$\begin{aligned}&P\left\{ |{\hat{\rho }}_j^0-\rho _j^0|\ge \varepsilon \right\} \nonumber \\&\quad \le O\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -8\kappa }\varepsilon ^2\right) \right. \nonumber \\&\qquad \left. +6 n^{2 \kappa } \exp \left( -C_2 n^{1-2 \alpha -4 \kappa }\right) +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$
(A.5)
where \(C_1\) and \(C_2\) are two constants.
In the following, we prove the consistency of \({\hat{\sigma }}_j^{-2}\). Similarly to Lemma A.5, we have
$$\begin{aligned} P\left( \left| {\hat{\sigma }}_j^2-\sigma _j^2\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} /2\right) +A \exp \left( -sn^{\alpha }\right) . \end{aligned}$$
It can be proved that
$$\begin{aligned}&P\left\{ \left| {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right| \ge \varepsilon \right\} =P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon {\hat{\sigma }}_j^2\sigma _j^2)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon {\hat{\sigma }}_j^2\sigma _j^2,{\hat{\sigma }}_j^2\ge M)+P({\hat{\sigma }}_j^2< M)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P({\hat{\sigma }}_j^2< M)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P(\sigma _j^2-|{\hat{\sigma }}_j^2-\sigma _j^2|< M)\\&\quad =P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P(|{\hat{\sigma }}_j^2-\sigma _j^2| > \sigma _j^2-M)\\&\quad \le 2 \exp \left( -n^{1-2 \alpha } ( \varepsilon M\sigma _j^2)^{2} /2\right) +2 \exp \left( -n^{1-2 \alpha }(\sigma _j^2-M)^{2} /2\right) +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$
Letting \(M=O(n^{-\xi })\), we have
$$\begin{aligned} P\left\{ \left| {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right| \ge \varepsilon \right\} \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi } \varepsilon ^2)\right) +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$
(A.6)
We now bound \(P(|{\hat{\rho }}_j-\rho _j|\ge \varepsilon )\). We have
$$\begin{aligned}&P(|{\hat{\rho }}_j-\rho _j|\ge \varepsilon )=P(|{\hat{\rho }}_j^0{\hat{\sigma }}_j^{-2}-\rho _j^0\sigma _j^{-2}|\ge \varepsilon ) \nonumber \\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|+|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon ) \nonumber \\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)+P(|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon /2) \end{aligned}$$
(A.7)
For the first term in (A.7), we have
$$\begin{aligned}&P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)\\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2, |{\hat{\rho }}_j^0|<M_1)+P(|{\hat{\rho }}_j^0|\ge M_1)\\&\quad \le P(|{\hat{\sigma }}_j^{-2}-\sigma _j^{-2}|>\varepsilon /(2M_1))+P(|{\hat{\rho }}_j^0-\rho _j^0|>M_1-|\rho _j^0|)\\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi } \varepsilon ^2/M_1^2)\right) +2A \exp \left( -sn^{\alpha }\right) \\&\qquad + O\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -8\kappa }(M_1-|\rho _j^0|)^2\right) \right. \\&\qquad \left. +6 n^{2 \kappa } \exp \left( -C_2 n^{1-2 \alpha -4 \kappa }\right) +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$
Letting \(M_1=O(n^\gamma )\), we have
$$\begin{aligned}&P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)\\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) +\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8 \kappa }\varepsilon ^2\right) \right. \\&\qquad \left. +\left( 4 n^{2 \kappa }+2 n^{\kappa }+2\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$
Similarly, we can prove
$$\begin{aligned}&P(|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon /2)\\&\quad \le (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) +(4 n^{2 \kappa }+2 n^{\kappa }+2) A\exp \left( -s n^{\alpha }\right) \\&\qquad +2 \exp \left( -O(n^{1-2 \alpha -2\xi +2\gamma } \varepsilon ^2)\right) \end{aligned}$$
Consequently, it is easily obtained that
$$\begin{aligned}&P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge \varepsilon \right\} \\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) +\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) \right. \\&\qquad \left. +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) + (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) \\&\qquad +(4 n^{2 \kappa }+2 n^{\kappa }) A\exp \left( -s n^{\alpha }\right) +2 \exp \left( -O(n^{1-2 \alpha -2\xi +2\gamma } \varepsilon ^2)\right) \\&\qquad +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$
After some simple simplification, we have
$$\begin{aligned}&P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge \varepsilon \right\} \\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) +\left( 8 n^{2 \kappa }+4 n^{\kappa }+2\right) A\exp \left( -s n^{\alpha }\right) \\&\qquad + (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) . \end{aligned}$$
By setting \(\varepsilon =c_\rho n^{-\eta }\), it is easily obtained that
$$\begin{aligned}&P\left\{ \max _{j\in \widehat{{\mathcal {A}}}^\star }\left| {\hat{\rho }}_j-\rho _j\right| \ge cn^{-\eta } \right\} \\&\quad \le s_A\left\{ 2\exp \left( -O(c_\rho ^2n^{1-2 \alpha -2\xi -2\gamma -2\eta })\right) \right. \\&\qquad +(2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( - O(c_\rho ^2n^{1-2 \alpha -2\gamma -2\eta -8\kappa })\right) \\&\qquad \left. +\left( 8 n^{2 \kappa }+4 n^{\kappa }+4\right) A\exp \left( -s n^{\alpha }\right) \right\} , \end{aligned}$$
where \(s_A\) is the size of \(\widehat{{\mathcal {A}}}^\star \). \(\square \)
Proof of Theorem 3.3
By the Bayes formula, it holds that
$$\begin{aligned} P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\right\} =P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} . \end{aligned}$$
On the one hand, \(P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \) has been bounded in Theorem 3.1; on the other hand, \(P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \) can be bounded in the same way as \(P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \), namely, \(P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \ge 1-s_{2n}P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge c_\rho n^{-\gamma } \right\} \). The proof is very similar to that of Theorem 3.1 and is thus omitted for simplicity. The sure screening property of \(\widehat{{\mathcal {I}}}\) can be proved in a similar spirit. \(\square \)