
Interaction screening via canonical correlation

  • Original paper
  • Published in Computational Statistics

Abstract

A new canonical correlation (CC) based interaction screening procedure, called CCIS, is proposed for the ultrahigh dimensional interaction model with a multivariate response. The CCIS procedure consists of two steps: first, it selects a set of candidate features that have a large CC with the squared response; then it recovers the influential main effects and interactions simultaneously from the reduced interaction model built from the features selected in the first step. The CCIS ranking statistic has a simple structure, so it can be computed very quickly. More importantly, CCIS is powerful in detecting features that have a linear relationship with the response. Both theoretical results and numerical studies are provided to illustrate the effectiveness of CCIS.


References

  • Chang J, Tang CY, Wu Y (2013) Marginal empirical likelihood and sure independence feature screening. Ann Stat 41:2123–2148
  • Cordell HJ (2009) Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet 10:392–404
  • Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Ser B (Stat Methodol) 70:849–911
  • Fan J, Song R et al (2010) Sure independence screening in generalized linear models with NP-dimensionality. Ann Stat 38:3567–3604
  • Feng Y, Wu Y, Stefanski LA (2018) Nonparametric independence screening via favored smoothing bandwidth. J Stat Plan Inference 197:1–14
  • Hall P, Xue JH (2014) On selecting interacting features from high-dimensional data. Comput Stat Data Anal 71:694–708
  • Hao N, Zhang HH (2014) Interaction screening for ultrahigh-dimensional data. J Am Stat Assoc 109:1285–1301
  • Kong Y, Li D, Fan Y, Lv J (2017) Interaction pursuit in high-dimensional multi-response regression via distance correlation. Ann Stat 45:897–922
  • Li R, Zhong W, Zhu L (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107:1129–1139
  • Li X, Cheng G, Wang L, Lai P, Song F (2017) Ultrahigh dimensional feature screening via projection. Comput Stat Data Anal 114:88–104
  • Liu J, Li R, Wu R (2014) Feature selection for varying coefficient models with ultrahigh dimensional covariates. J Am Stat Assoc 109:266–274
  • Lu J, Lin L (2018) Feature screening for multi-response varying coefficient models with ultrahigh dimensional predictors. Comput Stat Data Anal 128:242–254
  • Lu J, Lin L (2020) Model-free conditional screening via conditional distance correlation. Stat Pap 61:225–244
  • Luo S, Chen Z (2020) Feature selection by canonical correlation search in high-dimensional multiresponse models with complex group structures. J Am Stat Assoc 115:1227–1235
  • Mai Q, Zou H et al (2015) The fused Kolmogorov filter: a nonparametric model-free screening method. Ann Stat 43:1471–1497
  • Pan W, Wang X, Xiao W, Zhu H (2019) A generic sure independence screening procedure. J Am Stat Assoc 114:928–937
  • Song Y, Zhu X, Lin L (2014) Independent feature screening for ultrahigh-dimensional models with interactions. J Korean Stat Soc 43:567–583
  • Thompson B (1984) Canonical correlation analysis: uses and interpretation. Sage, Beverly Hills
  • Wang X, Leng C (2016) High-dimensional ordinary least-squares projection for screening variables. J R Stat Soc Ser B (Stat Methodol) 78:589–611
  • Zhu L, Li L, Li R, Zhu L (2011) Model-free feature screening for ultrahigh dimensional data. J Am Stat Assoc 106:1464–1475

Author information

Corresponding author

Correspondence to Dan Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jun Lu was supported by National Natural Science Foundation of China (No.12001486) and China Postdoctoral Foundation (2020M671782).

Appendix

Appendix A.1: Some lemmas

Before proving the main result, we introduce some lemmas, which will be used several times in our proof.

Lemma A.1

(Hoeffding inequality) Let \(X_{1}, \ldots , X_{n}\) be independent random variables. Assume that \(P\left( X_{i} \in \left[ a_{i}, b_{i}\right] \right) =1\) for \(1 \le i \le n,\) where \(a_{i}\) and \(b_{i}\) are constants. Let \({\bar{X}}=n^{-1} \sum _{i=1}^{n} X_{i}.\) Then, for any \(t>0\), the following inequality holds

$$\begin{aligned} P(|{\bar{X}}-E({\bar{X}})| \ge t) \le 2 \exp \left( -\frac{2 n^{2} t^{2}}{\sum _{i=1}^{n}\left( b_{i}-a_{i}\right) ^{2}}\right) . \end{aligned}$$
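As a rough numerical illustration of Lemma A.1 (not part of the original proof), the sketch below compares a Monte Carlo estimate of the tail probability with the Hoeffding bound. The Uniform(0, 1) distribution, the sample size, the threshold and the function names are all illustrative assumptions.

```python
import numpy as np

def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Right-hand side of Lemma A.1 when every X_i lies in [a, b]."""
    return 2.0 * np.exp(-2.0 * n**2 * t**2 / (n * (b - a) ** 2))

def empirical_tail(n, t, reps=20000, seed=0):
    """Monte Carlo estimate of P(|mean - E(mean)| >= t) for Uniform(0, 1) draws."""
    rng = np.random.default_rng(seed)
    means = rng.uniform(0.0, 1.0, size=(reps, n)).mean(axis=1)
    return float(np.mean(np.abs(means - 0.5) >= t))

n, t = 50, 0.1
emp = empirical_tail(n, t)
bound = hoeffding_bound(n, t)
print(f"empirical tail = {emp:.4f}, Hoeffding bound = {bound:.4f}")
```

The bound is typically loose for a fixed distribution, which is expected: it holds uniformly over all distributions supported on the given intervals.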

Lemma A.2

Suppose that a random variable U satisfies the sub-exponential condition \(E\exp (sU)<\infty \) for some constant \(s>0\), and that \(U_1,\ldots ,U_n\) are i.i.d. samples from U. Then, for any \(\varepsilon >0,\) there exist some positive constants \(0<\alpha <1 / 2, c_0\) and t such that

$$\begin{aligned} P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +c_0 n\exp \left( -s n^{\alpha }\right) \end{aligned}$$
(A.1)

Proof

For any \(M>0\),

$$\begin{aligned}&P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon \right) \\&\quad =P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) \\&\qquad +P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| >M\right) \\&\quad \le P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) +P\left( |U_i| \ge M \text{ for } \text{ some } i\right) \\&\quad \le P\left( \left| \frac{1}{n}\sum _{i=1}^n U_i-E(U) \right| \ge \varepsilon , \max |U_i| \le M\right) +n P\left( |U_i| \ge M\right) \\&\quad \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) + c_0 n\exp \left( -s n^{\alpha }\right) , \end{aligned}$$

where the last inequality follows from the Hoeffding inequality and the fact that \(P\left( \left| U_i\right| \ge M\right) \le 2E\exp (sU) \exp (-sM),\) which is deduced from the Markov inequality. Setting \(M=n^{\alpha }\) with \(0<\alpha <1/2\) and \(c_0=2E\exp (sU)\), (A.1) is proved. \(\square \)

Lemma A.3

By Condition (C1), we have that

$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(X_j^2,Y_k^2)-\text{ Cov }(X_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) , \end{aligned}$$

and

$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(Y_j^2,Y_k^2)-\text{ Cov }(Y_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$

Proof

If we set \(U=X_j^2Y_k^2\) for \(j\in \{1,\ldots ,p\}\) and \(k\in \{1,\ldots ,q\}\), we have

$$\begin{aligned} P\left( \left| {\widehat{E}}(X_j^2Y_k^2)-E(X_j^2Y_k^2)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$

By simple algebra, we have

$$\begin{aligned} {\widehat{E}}(X_j^2){\widehat{E}}(Y_k^2)-E(X_j^2)E(Y_k^2)&={\widehat{E}}(X_j^2)\{{\widehat{E}}(Y_k^2)-E(Y_k^2)\}\\&\quad +\{{\widehat{E}}(X_j^2)-E(X_j^2)\}E(Y_k^2). \end{aligned}$$

As a result,

$$\begin{aligned}&P\left( \left| {\widehat{E}}(X_j^2){\widehat{E}}(Y_k^2)-E(X_j^2)E(Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le P\left( \left| {\widehat{E}}(X_j^2)\{{\widehat{E}}(Y_k^2)-E(Y_k^2)\}\right| \ge \varepsilon /2\right) \\&\qquad +P\left( \left| \{{\widehat{E}}(X_j^2)-E(X_j^2)\}E(Y_k^2)\right| \ge \varepsilon /2\right) \\&\quad \le 2 \exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 8\right) +2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 8\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) , \end{aligned}$$

where \(c_1\) and \(c_2\) are constants depending on \(E(X_j^2)\) and \(E(Y_k^2)\), respectively. Consequently, it can be proved that

$$\begin{aligned}&P\left( \left| \widehat{\text{ Cov }}(X_j^2,Y_k^2)-\text{ Cov }(X_j^2,Y_k^2)\right| \ge \varepsilon \right) \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$

\(\square \)

Lemma A.4

Let \(\varvec{D}=\widehat{\varvec{\Sigma }}_{\widetilde{y}}-\varvec{\Sigma }_{\widetilde{y}}\). Then we have

$$\begin{aligned} \left| \lambda _{\min }(\widehat{\varvec{\Sigma }}_{\widetilde{y}})-\lambda _{\min }({\varvec{\Sigma }}_{\widetilde{y}})\right| \le q\Vert \varvec{D}\Vert _{\infty }. \end{aligned}$$

For the proof, see Lemma 4 in Li et al. (2017). The following lemma is also from Li et al. (2017); it can be proved using the Hoeffding inequality, the Markov inequality and some matrix theory.
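The inequality in Lemma A.4 can be checked numerically. The sketch below is only an illustration: it assumes that \(\Vert \varvec{D}\Vert _{\infty }\) denotes the entrywise maximum norm, and the AR(1)-type covariance matrix, the Gaussian data and the dimensions are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
q, n = 5, 200

# Hypothetical population covariance: AR(1)-type matrix with correlation 0.5.
idx = np.arange(q)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Sample covariance computed from n Gaussian draws.
y = rng.multivariate_normal(np.zeros(q), sigma, size=n)
sigma_hat = np.cov(y, rowvar=False)

d = sigma_hat - sigma
# eigvalsh returns eigenvalues in ascending order, so index 0 is lambda_min.
lhs = abs(np.linalg.eigvalsh(sigma_hat)[0] - np.linalg.eigvalsh(sigma)[0])
rhs = q * np.max(np.abs(d))  # q * ||D||_inf under the entrywise max norm
print(f"|lambda_min gap| = {lhs:.4f} <= q * max|D_ij| = {rhs:.4f}")
```

The check succeeds for any symmetric pair of matrices, because the eigenvalue gap is at most the spectral norm of \(\varvec{D}\), which in turn is at most q times its largest absolute entry.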

Lemma A.5

For any \(\varepsilon >0\),

$$\begin{aligned} P\left( \left| \lambda _{\min }\left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) -\lambda _{\min }\left( {\varvec{\Sigma }}_{\varvec{y}}\right) \right| \ge \varepsilon \right) \le q_{n}^{2}\left( 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2 q_{n}^{2}\right) +A \exp \left( -sn^{\alpha }\right) \right) \end{aligned}$$

In addition, there exist some positive constants \(c_{2}\) and \(c_{3}\) such that

$$\begin{aligned} P\left\{ \left| \Vert \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}\Vert -\Vert {\varvec{\Sigma }}_{\varvec{y}}^{-1}\Vert \right| \ge c_{3}\Vert \varvec{\Sigma }_{\varvec{y}}^{-1}\Vert \right\} \le q_{n}^{2}\left( 2 \exp \left( -c_{2} n^{1-2 \alpha } q_{n}^{-4}\right) + A \exp \left( -sn^{\alpha }\right) \right) , \end{aligned}$$

where \(\Vert \varvec{B}\Vert =\lambda ^{1/2}_{{\textit{max}}}(\varvec{B}^\top \varvec{B})\) denotes the operator norm of a matrix \(\varvec{B}\), that is, the square root of the maximal eigenvalue of \(\varvec{B}^\top \varvec{B}\).

Appendix A.2: Proof of Theorem 3.1

To prove the consistency of \(\widehat{\varvec{\Phi }}_j\), we need to make a decomposition on \(\widehat{\varvec{\Phi }}_j\). Note that \(\varvec{\Sigma }_{\widetilde{\varvec{x}}_j}\) is a \(2\times 2\) matrix, denoted as

$$\begin{aligned} \varvec{\Sigma }_{\widetilde{\varvec{x}}_j}=\left( \begin{array}{cc}\sigma ^2_{j,11}&{}\quad \sigma ^2_{j,12}\\ \sigma ^2_{j,21}&{}\quad \sigma ^2_{j,22}\end{array}\right) , \end{aligned}$$

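where \(\sigma ^2_{j,11}=\text{ Cov }(X_j^2,X_j^2)\), \(\sigma ^2_{j,12}=\sigma ^2_{j,21}=\text{ Cov }(X_j^2,X_j)\) and \(\sigma ^2_{j,22}=\text{ Cov }(X_j,X_j)\). Let \(\varvec{s}_k^\top \) be the k-th row of \(\varvec{\Sigma }_{\widetilde{\varvec{x}}_j\widetilde{\varvec{y}}}\) for \(k=1,2\), and let \(\gamma _{j,kl}\) denote the entries of \(\left( \varvec{\Sigma }_{\widetilde{\varvec{x}}_j}\right) ^{-1}\). Then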

$$\begin{aligned} {\varvec{\Phi }}_j= & {} \left( \varvec{\Sigma }_{\widetilde{\varvec{x}}_j}\right) ^{-1} \varvec{\Sigma }_{\widetilde{\varvec{x}}_j\widetilde{\varvec{y}}} \left( \varvec{\Sigma }_{\widetilde{\varvec{y}}}\right) ^{-1} \varvec{\Sigma }_{\widetilde{\varvec{y}}\widetilde{\varvec{x}}_j}\\= & {} \left( \begin{array}{cc} \sigma ^2_{j,11}&{}\quad \sigma ^2_{j,12}\\ \sigma ^2_{j,21}&{}\quad \sigma ^2_{j,22}\end{array}\right) ^{-1} \left( \begin{array}{c}\varvec{s}_1^\top \\ \varvec{s}_2^\top \end{array}\right) \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\left( \varvec{s}_1, \varvec{s}_2\right) \\= & {} \left( \begin{array}{cc}\gamma _{j,11}&{}\quad \gamma _{j,12}\\ \gamma _{j,21}&{}\quad \gamma _{j,22}\end{array}\right) \left( \begin{array}{cc} \varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1 &{}\quad \varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2 \\ \varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1 &{}\quad \varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2 \end{array}\right) \\= & {} \left( \begin{array}{cc} \underbrace{\gamma _{j,11}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1+\gamma _{j,12}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1}_{\phi _j^{11}}&{}\quad \underbrace{\gamma _{j,11}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2+\gamma _{j,12}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2}_{\phi _j^{12}}\\ \underbrace{\gamma _{j,21}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1+\gamma _{j,22}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_1}_{\phi _j^{21}}&{} \quad \underbrace{\gamma _{j,21}\varvec{s}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2+\gamma _{j,22}\varvec{s}_2^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\varvec{s}_2}_{\phi _j^{22}} \end{array}\right) . \end{aligned}$$

Similarly, we can decompose the sample analogue \(\widehat{\varvec{\Phi }}_j\) of \(\varvec{\Phi }_j\) as

$$\begin{aligned} \widehat{\varvec{\Phi }}_j= & {} \left( \begin{array}{cc} {\hat{\phi }}_j^{11}&{}\quad {\hat{\phi }}_j^{12}\\ {\hat{\phi }}_j^{21}&{}\quad {\hat{\phi }}_j^{22} \end{array}\right) \\= & {} \left( \begin{array}{cc} {\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1&{}\quad {\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2\\ {\hat{\gamma }}_{j,21}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,22}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1&{}\quad {\hat{\gamma }}_{j,21}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2+{\hat{\gamma }}_{j,22}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_2 \end{array}\right) . \end{aligned}$$

Note that

$$\begin{aligned} {\hat{\omega }}_j=\frac{1}{2}\left( {\hat{\phi }}_j^{11}+{\hat{\phi }}_j^{22} +\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2 +4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}\right) , \end{aligned}$$

where \({\hat{\phi }}_j^{11}\) is the sample version of \(\phi _j^{11}\), and \({\hat{\phi }}_j^{12}, {\hat{\phi }}_j^{21}\) and \({\hat{\phi }}_j^{22}\) are defined similarly.
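The closed-form expression above is the largest eigenvalue of a \(2\times 2\) matrix, so the ranking statistic is cheap to compute. A minimal sketch follows. It assumes, based on the covariances appearing in this appendix, that \(\widetilde{\varvec{x}}_j=(X_j^2,X_j)^\top \) and that \(\widetilde{\varvec{y}}\) collects the squared responses; the function name `ccis_omega` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def ccis_omega(xj, y):
    """Largest eigenvalue of the sample matrix Phi_j for one feature.

    Assumption: x_tilde_j = (X_j^2, X_j) and y_tilde = (Y_1^2, ..., Y_q^2).
    """
    xt = np.column_stack([xj**2, xj])  # n x 2
    yt = y**2                          # n x q
    n = xt.shape[0]
    xc = xt - xt.mean(axis=0)
    yc = yt - yt.mean(axis=0)
    s_x = xc.T @ xc / n                # 2 x 2 sample covariance of x_tilde_j
    s_y = yc.T @ yc / n                # q x q sample covariance of y_tilde
    s_xy = xc.T @ yc / n               # 2 x q sample cross-covariance
    # Phi_j = S_x^{-1} S_xy S_y^{-1} S_yx, computed with linear solves.
    phi = np.linalg.solve(s_x, s_xy) @ np.linalg.solve(s_y, s_xy.T)
    p11, p12, p21, p22 = phi.ravel()
    disc = (p11 - p22) ** 2 + 4.0 * p12 * p21
    omega = 0.5 * (p11 + p22 + np.sqrt(max(disc, 0.0)))  # guard rounding error
    return omega, phi

# Toy data (hypothetical): Y_1 depends on the interaction X_0 * X_1; X_3 is pure noise.
rng = np.random.default_rng(0)
n = 1000
x = rng.standard_normal((n, 4))
y = np.column_stack([
    2.0 * x[:, 0] * x[:, 1] + rng.standard_normal(n),
    x[:, 2] + rng.standard_normal(n),
])
omega_active, phi_active = ccis_omega(x[:, 0], y)
omega_noise, _ = ccis_omega(x[:, 3], y)
print(f"omega (active X_0) = {omega_active:.3f}, omega (noise X_3) = {omega_noise:.3f}")
```

Using linear solves instead of explicit matrix inverses is a numerical convenience only; it does not change the statistic.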

We first prove the consistency of \({\hat{\phi }}_j^{11}={\hat{\gamma }}_{j,11}\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1+{\hat{\gamma }}_{j,12}\hat{\varvec{s}}_2^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1\). It is sufficient to bound the first term in \({\hat{\phi }}_j^{11}\) because the other term shares the same asymptotic behavior. The proof is very similar in spirit to that of Li et al. (2017), so we only give an outline. Denote by \(\Vert \varvec{s}\Vert \) the Euclidean norm of a vector \(\varvec{s}\). For notational simplicity, let \(\varvec{A}=\varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}\) and \(\widehat{\varvec{A}}=\widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\). By some simple algebra, we have

$$\begin{aligned}&\hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2-\varvec{s}_1^\top \varvec{A}\varvec{s}_2\\&\quad = \underbrace{(\hat{\varvec{s}}_1-{\varvec{s}}_1)^\top \widehat{\varvec{A}}(\hat{\varvec{s}}_2-{\varvec{s}}_2)}_{I_{11}} +\underbrace{(\hat{\varvec{s}}_1-{\varvec{s}}_1)^\top \widehat{\varvec{A}}{\varvec{s}}_2}_{I_{12}}+ \underbrace{{\varvec{s}}_1^\top \widehat{\varvec{A}}(\hat{\varvec{s}}_2-{\varvec{s}}_2)}_{I_{13}}+ \underbrace{\varvec{s}_1^\top (\widehat{\varvec{A}}-\varvec{A})\varvec{s}_2}_{I_{14}} \end{aligned}$$
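The four-term decomposition above is a purely algebraic identity, so it can be verified on arbitrary inputs. The sketch below uses randomly generated vectors and matrices as stand-ins for \(\varvec{s}_1,\varvec{s}_2,\varvec{A}\) and their estimates; all numerical values and the helper `random_spd` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
q = 4

def random_spd(rng, q):
    """Random symmetric positive definite q x q matrix (illustrative only)."""
    b = rng.standard_normal((q, q))
    return b @ b.T + q * np.eye(q)

# Stand-ins for the population quantities and their perturbed estimates.
s1, s2 = rng.standard_normal(q), rng.standard_normal(q)
s1_hat = s1 + 0.1 * rng.standard_normal(q)
s2_hat = s2 + 0.1 * rng.standard_normal(q)
a = random_spd(rng, q)
a_hat = a + 0.05 * np.eye(q)

i11 = (s1_hat - s1) @ a_hat @ (s2_hat - s2)
i12 = (s1_hat - s1) @ a_hat @ s2
i13 = s1 @ a_hat @ (s2_hat - s2)
i14 = s1 @ (a_hat - a) @ s2
lhs = s1_hat @ a_hat @ s2_hat - s1 @ a @ s2
print(f"identity residual = {abs(lhs - (i11 + i12 + i13 + i14)):.2e}")
```

Only the term \(I_{11}\) is quadratic in the estimation errors, which is why it is handled separately from the three first-order terms.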

We then bound \(I_{1i}\) for \(i=1,\ldots ,4\) one by one. For the first term \(I_{11}\), note that

$$\begin{aligned} |I_{11}|\le \Vert \widehat{\varvec{A}}\Vert \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert \Vert \hat{\varvec{s}}_2-\varvec{s}_2\Vert \end{aligned}$$

By Lemma A.3, we have

$$\begin{aligned}&P\left( \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert ^2 \ge q\varepsilon ^2 \right) \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \nonumber \\&\quad + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$
(A.2)

Similarly, the same probability bound holds for \(\hat{\varvec{s}}_2-\varvec{s}_2\). We next bound the matrix \(\widehat{\varvec{A}}\). For some constant \(c_3\), we have

$$\begin{aligned}&P \left\{ \left| \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \right| \ge c_{3}\Vert \varvec{A}\Vert \right\} \\&\quad = P\left\{ \left| \lambda _{\max }(\widehat{\varvec{A}})-\lambda _{\max }(\varvec{A})\right| \ge c_3 \lambda _{\max }(\varvec{A})\right\} \\&\quad \le P\left\{ \left| \lambda ^{-1}_{\min }(\widehat{\varvec{A}}^{-1})-\lambda ^{-1}_{\min }({\varvec{A}}^{-1})\right| \ge c_{3}\lambda ^{-1}_{\min }(\varvec{A}^{-1})\right\} \\&\quad \le P\left\{ \left| \lambda _{\min }(\widehat{\varvec{A}}^{-1})-\lambda _{\min }({\varvec{A}}^{-1})\right| \ge c_3\lambda _{\min }(\varvec{A}^{-1})\right\} \\&\quad \le P\left\{ \left| \lambda _{\min }(\widehat{\varvec{A}}^{-1})-\lambda _{\min }({\varvec{A}}^{-1})\right| \ge c_3C_1q^{-1}\right\} \\&\quad \le P\left( q\Vert \varvec{D}\Vert _\infty \ge c_3C_1q^{-1}\right) \\&\quad \le 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \right. \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} , \end{aligned}$$

where the second-to-last inequality holds because of Lemma A.4.

By condition (C2), it follows that

$$\begin{aligned}&P\left\{ \Vert \widehat{\varvec{A}}\Vert \ge \left( c_{3}+1\right) C_2 q^{-1}\right\} \le P\left\{ \Vert \widehat{\varvec{A}}\Vert \ge \left( c_{3}+1\right) \Vert \varvec{A}\Vert \right\} \nonumber \\&\quad =P\left\{ \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \ge c_{3}\Vert \varvec{A}\Vert \right\} \le P\left\{ \left| \Vert \widehat{\varvec{A}}\Vert -\Vert \varvec{A}\Vert \right| \ge c_{3}\Vert \varvec{A}\Vert \right\} \nonumber \\&\quad \le 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \right. \nonumber \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} . \end{aligned}$$
(A.3)

Consequently,

$$\begin{aligned}&P\left\{ \left| I_{11}\right| \ge \left( c_{3}+1\right) C_2\varepsilon ^{2}\right\} \\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1n^{1-2 \alpha } \varepsilon ^{2} / 32\right) + 2 \exp \left( -c_2n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \\&\qquad +2c_0 n\exp \left( -s n^{\alpha /2}\right) + 2q^2\left\{ \exp \left( -n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 8\right) \right. \\&\qquad +\exp \left( -c_1n^{1-2 \alpha } c_3^2C_1^2q^{-4} / 32\right) \\&\qquad + \left. \exp \left( -c_2n^{1-2 \alpha }c_3^2C_1^2q^{-4} / 32\right) +c_0 n\exp \left( -s n^{\alpha /2}\right) \right\} . \end{aligned}$$

Or equivalently,

$$\begin{aligned}&P\left\{ \left| I_{11}\right| \ge \varepsilon \right\} \\&\quad \le 2\exp \left( O\left\{ -n^{1-2 \alpha } \varepsilon ^{2}\right\} \right) + 2q^2 \exp \left( -O\left\{ n^{1-2 \alpha }q^{-4}\right\} \right) +c_0n\exp \left( -s n^{\alpha /2}\right) \end{aligned}$$

where the constants implicit in the \(O\{\cdot \}\) terms depend on \(c_1,c_2,c_3,C_1\) and \(C_2\).

We then prove the convergence rate for the second term \(I_{12}\). We have

$$\begin{aligned} |I_{12}|\le \Vert \hat{\varvec{s}}_1-\varvec{s}_1\Vert \Vert \widehat{\varvec{A}}\varvec{s}_2\Vert . \end{aligned}$$

For \(\Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \), by triangle inequality, it follows that

$$\begin{aligned} \Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \le \Vert (\widehat{\varvec{A}}-\varvec{A})\varvec{s}_2 \Vert + \Vert \varvec{A}\varvec{s}_2\Vert \le \Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert + \Vert \varvec{A}\varvec{s}_2\Vert . \end{aligned}$$

For \(\Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert \), we have

$$\begin{aligned} \Vert \widehat{\varvec{A}}-\varvec{A}\Vert =\Vert \widehat{\varvec{A}}(\varvec{A}^{-1}-\widehat{\varvec{A}}^{-1})\varvec{A}\Vert \le \Vert \widehat{\varvec{A}}\Vert \Vert \varvec{A}\Vert \Vert \varvec{A}^{-1}-\widehat{\varvec{A}}^{-1}\Vert \le q^2C_1^{-1}\Vert \widehat{\varvec{A}}\Vert \Vert \varvec{D}\Vert _\infty , \end{aligned}$$

where the last inequality holds because \(\Vert \varvec{A}\Vert =\lambda _{\min }^{-1}(\varvec{A}^{-1})\le (C_1/q)^{-1}\) and \(\Vert \varvec{A}^{-1}-\widehat{\varvec{A}}^{-1}\Vert \le q\Vert \varvec{D}\Vert _\infty \). As \(\Vert \varvec{D}\Vert _\infty \) converges to zero by the central limit theorem and \(\Vert \widehat{\varvec{A}}\Vert \) is bounded by (A.3), \(\Vert \widehat{\varvec{A}}-\varvec{A}\Vert \Vert \varvec{s}_2 \Vert \) is negligible as \(n\rightarrow \infty \). Thus \(\Vert \varvec{A}\varvec{s}_2\Vert \le \Vert \varvec{A}\Vert \Vert \varvec{s}_2\Vert =\Vert \varvec{s}_2\Vert /\lambda _{\min }(\varvec{A}^{-1})\le \Vert \varvec{s}_2\Vert q/C_1\) controls the convergence rate of \(\Vert \widehat{\varvec{A}}\varvec{s}_2\Vert \). It follows that \(\Vert \varvec{A}\varvec{s}_2\Vert ^2 \le \Vert \varvec{s}_2\Vert ^2q^2/C^2_1\le q^3C_1^{-2}\Vert \varvec{s}_2\Vert ^2_\infty \). Note that \(\Vert \varvec{s}_2\Vert ^2_\infty \) is finite under Condition (C1) in Section 3 as it is an expectation. Finally, combining the above results with (A.2), we have

$$\begin{aligned} P\left( |I_{12}| \ge \varepsilon \right)\le & {} P\left( \Vert \widehat{\varvec{s}}_1-\varvec{s}_1\Vert ^2 \ge q^{-2}\varepsilon ^2C_1^2/\Vert \varvec{s}_2\Vert ^2_\infty \right) \nonumber \\\le & {} 2\exp \left( -C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -c_1C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 32\right) \nonumber \\&+ 2 \exp \left( -c_2C_1^2q^{-4}n^{1-2 \alpha } \varepsilon ^{2} / 32\right) +2c_0 n\exp \left( -s n^{\alpha /2}\right) \nonumber \\= & {} O\{\exp \left( -O\{q^{-4}n^{1-2 \alpha } \varepsilon ^{2}\}\right) +n\exp \left( -s n^{\alpha /2} \right) \}. \end{aligned}$$
(A.4)

The terms \(I_{13}\) and \(I_{14}\) have the same asymptotic behavior as \(I_{12}\), so we omit the details to save space. Consequently,

$$\begin{aligned}&P\left\{ \left| \hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2-\varvec{s}_1^\top \varvec{A}\varvec{s}_2\right| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha } \varepsilon ^{2}\}\right) . \end{aligned}$$

Note that when combining the bounds on \(P\left( |I_{11}| \ge \varepsilon \right) \) and \(P\left( |I_{12}| \ge \varepsilon \right) \), we drop the negligible terms, which do not control the convergence rate of \(\hat{\varvec{s}}_1^\top \widehat{\varvec{A}}\hat{\varvec{s}}_2\).

Next, we prove the consistency of \({\hat{\gamma }}_{j,12}\). For simplicity, denote \({{\hat{g}}}_j={\hat{\sigma }}^2_{j,11}{\hat{\sigma }}^2_{j,22}-{\hat{\sigma }}_{j,12}^2{\hat{\sigma }}_{j,21}^2\), and \(g_j=\sigma ^2_{j,11}\sigma ^2_{j,22}-\sigma _{j,12}^2\sigma _{j,21}^2\). So, it can be derived that

$$\begin{aligned}&P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon ) =P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon \right) \\&\quad \le P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon , {{\hat{g}}}_j\le N\right) \\&\qquad +P\left( \frac{1}{{{\hat{g}}}_j}\left| \frac{g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}{g_j}\right| \ge \varepsilon , {{\hat{g}}}_j> N\right) \\&\quad \le P\left( \left| {g_j{\hat{\sigma }}_{j,12}^2-{{\hat{g}}}_j\sigma _{j,12}^2}\right| \ge {g_j}\varepsilon /N\right) +P({{\hat{g}}}_j>N)\\&\quad \le P\left( \left| g_j({\hat{\sigma }}_{j,12}^2-\sigma _{j,12}^2)\right| +\left| \sigma _{j,12}^2({{\hat{g}}}_j-g_j)\right| \ge {g_j}\varepsilon /N\right) +P({{\hat{g}}}_j>N). \end{aligned}$$

Using a technique similar to the proof of Lemma A.3, we can obtain a similar upper bound for the probability \(P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon )\); namely, there exist constants \(b_0\), \(b_1\) and \(b_2\) such that

$$\begin{aligned}&P(|{\hat{\gamma }}_{j,12}-\gamma _{j,12}|\ge \varepsilon )\\&\quad \le 2\exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 8\right) + 2\exp \left( -b_1n^{1-2 \alpha } \varepsilon ^{2}\right) + 2 \exp \left( -b_2n^{1-2 \alpha } \varepsilon ^{2}\right) \\&\qquad +2b_0 n\exp \left( -s n^{\alpha /2}\right) \\&\quad =\exp \left( -O\{n^{1-2 \alpha } \varepsilon ^{2}\}\right) +2b_0 n\exp \left( -s n^{\alpha /2}\right) \end{aligned}$$

For simplicity, let \({\hat{\delta }}=\hat{\varvec{s}}_1^\top \widehat{\varvec{\Sigma }}_{\widetilde{\varvec{y}}}^{-1}\hat{\varvec{s}}_1\) and \(\delta ={\varvec{s}}_1^\top \varvec{\Sigma }_{\widetilde{\varvec{y}}}^{-1}{\varvec{s}}_1\), then

$$\begin{aligned} {\hat{\gamma }}_{j,11}{\hat{\delta }}-\gamma _{j,11}\delta =\underbrace{{\hat{\gamma }}_{j,11}({\hat{\delta }}-\delta )}_{J_1}+\underbrace{({\hat{\gamma }}_{j,11}-\gamma _{j,11})\delta }_{J_2}. \end{aligned}$$

Upper bound of \(J_2\) has the same order as \(|{\hat{\gamma }}_{j,11}-\gamma _{j,11}|\) because \(\delta \) is bounded. The bound of \(J_1\) is also easily obtained because it follows that

$$\begin{aligned} P(|{\hat{\gamma }}_{j,11}({\hat{\delta }}-\delta )|\ge \varepsilon )\le P(|{\hat{\delta }}-\delta |\ge \varepsilon /H)+P(|{\hat{\gamma }}_{j,11}|\ge H), \end{aligned}$$

where the second term is negligible if we set \(H=n^\xi \). Consequently,

$$\begin{aligned}&P\left\{ \left| {\hat{\gamma }}_{j,11}{\hat{\delta }}-\gamma _{j,11}\delta \right| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha -2\xi } \varepsilon ^{2}\}\right) . \end{aligned}$$

By a similar argument, the asymptotic behavior of \(\hat{\phi }_j^{22}\) and \(\hat{\phi }_j^{12}\) can also be obtained.

Finally, we prove the consistency of \({\hat{\omega }}_j\). Note that

$$\begin{aligned}&\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}-\sqrt{(\phi _j^{11}-\phi _j^{22})^2+4\phi _j^{12}\phi _j^{21}}\\&\quad =\frac{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}-(\phi _j^{11}-\phi _j^{22})^2-4\phi _j^{12}\phi _j^{21}}{\sqrt{({\hat{\phi }}_j^{11}-{\hat{\phi }}_j^{22})^2+4{\hat{\phi }}_j^{12}{\hat{\phi }}_j^{21}}+\sqrt{(\phi _j^{11}-\phi _j^{22})^2+4\phi _j^{12}\phi _j^{21}}}. \end{aligned}$$

Thus it is sufficient to prove the consistency of the numerator, as the denominator is bounded in probability. It is not difficult to show that the numerator shares the same convergence rate as \({\hat{\phi }}_j^{11}\); we omit the details to save space. In summary, \({\hat{\omega }}_j\) has the same convergence rate as \({\hat{\phi }}_j^{11}\). Consequently, we have

$$\begin{aligned}&P\left\{ |{\hat{\omega }}_j-\omega _j| \ge \varepsilon \right\} \\&\quad \le n\exp \left( -s n^{\alpha /2}\right) + 2q^2 \exp \left( -O\{n^{1-2 \alpha }q^{-4}\}\right) +\exp \left( -O\{q^{-4}n^{1-2 \alpha -2\xi } \varepsilon ^{2}\}\right) \\&\qquad +\exp \left( O\{-n^{1-2 \alpha } \varepsilon ^{2}\}\right) +2b_0 n\exp \left( -s n^{\alpha /2}\right) . \end{aligned}$$

Let \(\varepsilon =c_\omega n^{-\tau }\) and use the condition \(q=O(n^\kappa )\); then it follows that

$$\begin{aligned}&P\left\{ \max _{1\le j\le p}\left| {\hat{\omega }}_j-\omega _j\right| \ge c_\omega n^{-\eta } \right\} \le \sum _{j=1}^p P\left\{ \left| {\hat{\omega }}_j-\omega _j\right| \ge c_\omega n^{-\eta } \right\} \\&\quad \le p\left( O\{n\exp \left( -s n^{\alpha /2}\right) \} + 2n^{2\kappa } \exp \left( -O\{n^{1-2 \alpha -4\kappa }\}\right) \right. \\&\qquad \left. + \exp \left( -O\{c_\omega ^2 n^{1-2\alpha -2\tau -2\xi -4\kappa }\}\right) \right) \end{aligned}$$

Now, we prove the second part of the theorem. If \({\mathcal {A}} \nsubseteq \widehat{{\mathcal {A}}}\), then there must exist some \(k\in {\mathcal {A}}\) such that \({\hat{\omega }}_k<c_\omega n^{-\eta }\). It follows from Condition (C4) that \(|{\hat{\omega }}_k-\omega _k|>(c_\omega '-c_\omega )n^{-\eta }\) for some \(k\in {\mathcal {A}}\), indicating that the events satisfy \({\mathcal {E}}_n=\left\{ {\mathcal {A}}\nsubseteq \widehat{{\mathcal {A}}}\right\} \subset \{|{\hat{\omega }}_k-\omega _k|>(c_\omega '-c_\omega )n^{-\eta }, \text{ for } \text{ some } k\in {\mathcal {A}}\}\). Consequently,

$$\begin{aligned} \begin{aligned}&P({\mathcal {A}} \subset \widehat{{\mathcal {A}}}) = P({\mathcal {E}}_n^c)=1-P({\mathcal {E}}_n)\\&\quad \ge 1-P\left( \max _{j \in {\mathcal {A}}}\left| \omega _{j}-\hat{\omega }_{j}\right| \ge (c_\omega '-c_\omega ) n^{-\eta }\right) \ge 1-s_{1n}P\left( \left| \omega _{j}-\hat{\omega }_{j}\right| \ge b_\omega n^{-\eta }\right) \\&\quad \ge 1-s_{1n}\left( O\{n\exp \left( -s n^{\alpha /2}\right) \} + 2n^{2\kappa } \exp \left( -O\{n^{1-2 \alpha -4\kappa }\}\right) \right. \\&\qquad \left. + \exp \left( -O\{b_\omega ^2 n^{1-2\alpha -2\tau -2\xi -4\kappa }\}\right) \right) , \end{aligned} \end{aligned}$$

where \(b_\omega =c_\omega '-c_\omega \). \(\square \)

Proof of Corollary 3.1

Define \( \delta =\min _{j \in {\mathcal {A}}} \omega _{j}-\max _{j \in {\mathcal {A}}^{c}} \omega _{j} \). Then, under Condition \((\mathrm {C} 5),\) we have

$$\begin{aligned}&P\left( \min _{j \in {\mathcal {A}}} \hat{\omega }_{j} \le \max _{j \in {\mathcal {A}}^{c}} \hat{\omega }_{j}\right) \\&\quad = P\left( \min _{j \in {\mathcal {A}}} \hat{\omega }_{j}-\min _{j \in {\mathcal {A}}} \omega _{j}+\delta \le \max _{j \in {\mathcal {A}}^{c}} \hat{\omega }_{j}-\max _{j \in {\mathcal {A}}^{c}} \omega _{j}\right) \\&\quad \le P\left( \max _{j \in {\mathcal {A}}^{c}}\left| \hat{\omega }_{j}-\omega _{j}\right| +\max _{j \in {\mathcal {A}}}\left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta \right) \\&\quad \le P\left( \max _{1 \le j \le p}\left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta / 2\right) \le \sum _{j=1}^{p} P\left( \left| \hat{\omega }_{j}-\omega _{j}\right| \ge \delta / 2\right) . \end{aligned}$$

The last term tends to 0 as \(n \rightarrow \infty \), provided that \(p\) grows slowly enough for the sum of the \(p\) tail probabilities to vanish. \(\square \)
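The two inequalities in the display above rely on elementary facts about minima and maxima that the proof leaves implicit. A short sketch of the missing step, for a generic finite index set \(S\) (a symbol introduced here only for illustration):

$$\begin{aligned}&\left| \min _{j \in S} \hat{\omega }_{j}-\min _{j \in S} \omega _{j}\right| \le \max _{j \in S}\left| \hat{\omega }_{j}-\omega _{j}\right| , \qquad \left| \max _{j \in S} \hat{\omega }_{j}-\max _{j \in S} \omega _{j}\right| \le \max _{j \in S}\left| \hat{\omega }_{j}-\omega _{j}\right| ,\\&\delta \le \left( \max _{j \in {\mathcal {A}}^{c}} \hat{\omega }_{j}-\max _{j \in {\mathcal {A}}^{c}} \omega _{j}\right) -\left( \min _{j \in {\mathcal {A}}} \hat{\omega }_{j}-\min _{j \in {\mathcal {A}}} \omega _{j}\right) \le \max _{j \in {\mathcal {A}}^{c}}\left| \hat{\omega }_{j}-\omega _{j}\right| +\max _{j \in {\mathcal {A}}}\left| \hat{\omega }_{j}-\omega _{j}\right| . \end{aligned}$$

If the sum of two nonnegative quantities is at least \(\delta \), then the larger of the two is at least \(\delta /2\), which gives the bound in terms of \(\max _{1 \le j \le p}\left| \hat{\omega }_{j}-\omega _{j}\right| \).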

Proof of Theorem 3.2

We prove only the consistency of \({\hat{\rho }}_j\); the proof for \({\hat{\pi }}_{kl}\) is similar.

Setting \(U_i=X_{ij}Y_{ik}\) and \(U_i=Y_{ik}Y_{il}\) in turn, Lemma A.3 gives

$$\begin{aligned} P\left( \left| \frac{1}{n}\sum _{i=1}^n X_{ij}Y_{ik} -E (X_jY_k)\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +A_1 \exp \left( -s n^{\alpha }\right) \end{aligned}$$

Similarly, we can bound \(P\left( \left| \frac{1}{n}\sum _{i=1}^n Y_{ij} -E (Y_j)\right| \ge \varepsilon \right) \) and \(P\left( \left| \frac{1}{n}\sum _{i=1}^n X_{ij} -E (X_j)\right| \ge \varepsilon \right) \). Thus, it can be proved that there exist constants \(A\) and \(s\) such that

$$\begin{aligned} P\left( \left| \widehat{\varvec{\Sigma }}_{X_{j}Y_{k}} -{\varvec{\Sigma }}_{X_{j}Y_{k}}\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} / 2\right) +A\exp \left( -s n^{\alpha }\right) \end{aligned}$$

Note that

$$\begin{aligned} {\hat{\rho }}_j={\hat{\sigma }}_j^{-2}\widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1} \widehat{\varvec{\Sigma }}_{\varvec{y}X_j} \end{aligned}$$
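As a concrete illustration of the statistic just displayed, the following minimal sketch computes \({\hat{\rho }}_j\) for one scalar variable against a multivariate response. The function name `rho_hat`, the simulated data, and the use of the \(1/n\) normalisation for every sample moment are assumptions made here for illustration; they are not taken from the paper.

```python
import numpy as np

def rho_hat(x, Y):
    """Sample analogue of sigma_j^{-2} Sigma_{x y} Sigma_y^{-1} Sigma_{y x}.

    x : (n,) values of one scalar variable.
    Y : (n, q) response matrix (a 1-d array is treated as q = 1).
    All moments use 1/n scaling (an assumption; the scaling cancels in the ratio).
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    n = x.shape[0]
    xc = x - x.mean()                 # centred variable
    Yc = Y - Y.mean(axis=0)           # centred responses
    sigma2 = xc @ xc / n              # sample variance of x
    s_xy = xc @ Yc / n                # sample cross-covariance, shape (q,)
    S_y = Yc.T @ Yc / n               # sample covariance of Y, shape (q, q)
    # solve() avoids forming the explicit inverse of S_y
    return float(s_xy @ np.linalg.solve(S_y, s_xy) / sigma2)

# Hypothetical data: one variable related to Y, one unrelated.
rng = np.random.default_rng(0)
n, q = 200, 3
Y = rng.normal(size=(n, q))
x_signal = Y @ np.array([1.0, -0.5, 0.0]) + rng.normal(size=n)
x_noise = rng.normal(size=n)
print(rho_hat(x_signal, Y), rho_hat(x_noise, Y))
```

With a univariate response the quantity reduces to the squared sample Pearson correlation, which gives a quick sanity check of the implementation.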

Let \(\hat{\rho }^0_j=\widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( \widehat{\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1} \widehat{\varvec{\Sigma }}_{\varvec{y}X_j}\) and \(\rho _j^0={\varvec{\Sigma }}_{X_j{\varvec{y}}} \left( {\varvec{\Sigma }}_{\varvec{y}}\right) ^{-1}{\varvec{\Sigma }}_{\varvec{y}X_j}.\) Then by some algebra, we have

$$\begin{aligned} {\hat{\rho }}_j^0-\rho _j^0=I_1+I_2+I_3, \end{aligned}$$

where

$$\begin{aligned}&I_1=\left( \widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}}-{\varvec{\Sigma }}_{X_j{\varvec{y}}}\right) \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}\left( \widehat{\varvec{\Sigma }}_{{\varvec{y}}X_j}-{\varvec{\Sigma }}_{{\varvec{y}}X_j}\right) \\&I_2=2\left( \widehat{\varvec{\Sigma }}_{X_j{\varvec{y}}}-{\varvec{\Sigma }}_{X_j{\varvec{y}}}\right) \widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}{\varvec{\Sigma }}_{{\varvec{y}}X_j}\\&I_3={\varvec{\Sigma }}_{X_j{\varvec{y}}}\left( \widehat{\varvec{\Sigma }}_{{\varvec{y}}}^{-1}-{\varvec{\Sigma }}_{{\varvec{y}}}^{-1}\right) {\varvec{\Sigma }}_{{\varvec{y}}X_j} \end{aligned}$$
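The three-term split of \({\hat{\rho }}_j^0-\rho _j^0\) can be checked numerically. The sketch below uses arbitrary stand-ins for the covariance quantities (all variable names and the random inputs are hypothetical) and writes the third term with the difference of the inverse matrices, \(\widehat{\varvec{\Sigma }}_{\varvec{y}}^{-1}-{\varvec{\Sigma }}_{\varvec{y}}^{-1}\), which is the form under which the algebra closes exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
q = 4

def random_spd(q):
    """Random symmetric positive definite q x q matrix."""
    B = rng.normal(size=(q, q))
    return B @ B.T + q * np.eye(q)

a0 = rng.normal(size=q)                  # stand-in for Sigma_{X_j y}
a_hat = a0 + 0.1 * rng.normal(size=q)    # stand-in for its estimate
S = random_spd(q)                        # stand-in for Sigma_y
S_hat = S + 0.05 * random_spd(q)         # stand-in for its estimate (still SPD)

S_inv = np.linalg.inv(S)
S_hat_inv = np.linalg.inv(S_hat)
d = a_hat - a0                           # estimation error of the cross-covariance

# Left-hand side: difference of the two quadratic forms.
lhs = a_hat @ S_hat_inv @ a_hat - a0 @ S_inv @ a0

I1 = d @ S_hat_inv @ d                   # quadratic in the error
I2 = 2 * d @ S_hat_inv @ a0              # cross term (uses symmetry of S_hat_inv)
I3 = a0 @ (S_hat_inv - S_inv) @ a0       # error in the inverse covariance
print(lhs, I1 + I2 + I3)
```

The two printed numbers agree up to floating-point error, since the identity is exact for any symmetric invertible pair of matrices.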

The remaining steps are the same as those in Li et al. (2017), so we omit the details. Eventually, we have

$$\begin{aligned}&P\left\{ |{\hat{\rho }}_j^0-\rho _j^0|\ge \varepsilon \right\} \nonumber \\&\quad \le O\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -8\kappa }\varepsilon ^2\right) \right. \nonumber \\&\qquad \left. +6 n^{2 \kappa } \exp \left( -C_2 n^{1-2 \alpha -4 \kappa }\right) +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$
(A.5)

where \(C_1\) and \(C_2\) are two constants.

Next, we prove the consistency of \({\hat{\sigma }}_j^{-2}\). Similarly to Lemma A.5, we have

$$\begin{aligned} P\left( \left| {\hat{\sigma }}_j^2-\sigma _j^2\right| \ge \varepsilon \right) \le 2 \exp \left( -n^{1-2 \alpha } \varepsilon ^{2} /2\right) +A \exp \left( -sn^{\alpha }\right) . \end{aligned}$$

It can be proved that

$$\begin{aligned}&P\left\{ \left| {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right| \ge \varepsilon \right\} =P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon {\hat{\sigma }}_j^2\sigma _j^2)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon {\hat{\sigma }}_j^2\sigma _j^2,{\hat{\sigma }}_j^2\ge M)+P({\hat{\sigma }}_j^2< M)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P({\hat{\sigma }}_j^2< M)\\&\quad \le P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P(|\sigma _j^2|-|{\hat{\sigma }}_j^2-\sigma _j^2|< M)\\&\quad =P(|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2)+P(|{\hat{\sigma }}_j^2-\sigma _j^2| > \sigma _j^2-M)\\&\quad \le 2 \exp \left( -n^{1-2 \alpha } ( \varepsilon M\sigma _j^2)^{2} /2\right) +2 \exp \left( -n^{1-2 \alpha }(\sigma _j^2-M)^{2} /2\right) +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$

Taking \(M=O(n^{-\xi })\), we obtain

$$\begin{aligned} P\left\{ \left| {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right| \ge \varepsilon \right\} \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi } \varepsilon ^2)\right) +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$
(A.6)
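Two elementary facts drive the chain of inequalities leading to (A.6). They are sketched here for positive \({\hat{\sigma }}_j^2\) and \(\sigma _j^2\), with the events \(E\) and \(F\) introduced only for illustration:

$$\begin{aligned}&\left| {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right| =\frac{\left| {\hat{\sigma }}_j^2-\sigma _j^2\right| }{{\hat{\sigma }}_j^2\sigma _j^2},\\&P(E)=P(E\cap F)+P(E\cap F^c)\le P(E\cap F)+P(F^c), \end{aligned}$$

where \(E=\{|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon {\hat{\sigma }}_j^2\sigma _j^2\}\) and \(F=\{{\hat{\sigma }}_j^2\ge M\}\). On \(F\) the threshold \(\varepsilon {\hat{\sigma }}_j^2\sigma _j^2\) is at least \(\varepsilon M\sigma _j^2\), so \(E\cap F\) is contained in \(\{|{\hat{\sigma }}_j^2-\sigma _j^2|\ge \varepsilon M\sigma _j^2\}\).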

We now bound \(P(|{\hat{\rho }}_j-\rho _j|\ge \varepsilon )\). We have

$$\begin{aligned}&P(|{\hat{\rho }}_j-\rho _j|\ge \varepsilon )=P(|{\hat{\rho }}_j^0{\hat{\sigma }}_j^{-2}-\rho _j^0\sigma _j^{-2}|\ge \varepsilon ) \nonumber \\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|+|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon ) \nonumber \\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)+P(|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon /2) \end{aligned}$$
(A.7)
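The split in (A.7) rests on adding and subtracting \({\hat{\rho }}_j^0\sigma _j^{-2}\), followed by the triangle inequality. Written out as a short sketch:

$$\begin{aligned} {\hat{\rho }}_j^0{\hat{\sigma }}_j^{-2}-\rho _j^0\sigma _j^{-2}={\hat{\rho }}_j^0\left( {\hat{\sigma }}_j^{-2}-\sigma _j^{-2}\right) +\sigma _j^{-2}\left( {\hat{\rho }}_j^0-\rho _j^0\right) . \end{aligned}$$

If the sum of the two absolute values is at least \(\varepsilon \), then either the first exceeds \(\varepsilon /2\) or the second is at least \(\varepsilon /2\), which yields the two probabilities on the last line of (A.7).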

For the first term in (A.7), we have

$$\begin{aligned}&P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)\\&\quad \le P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2, |{\hat{\rho }}_j^0|<M_1)+P(|{\hat{\rho }}_j^0|\ge M_1)\\&\quad \le P(|{\hat{\sigma }}_j^{-2}-\sigma _j^{-2}|>\varepsilon /(2M_1))+P(|{\hat{\rho }}_j^0-\rho _j^0|\ge M_1-|\rho _j^0|)\\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi } \varepsilon ^2/M_1^2)\right) +2A \exp \left( -sn^{\alpha }\right) \\&\qquad + O\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -8\kappa }(M_1-|\rho _j^0|)^2\right) \right. \\&\qquad \left. +6 n^{2 \kappa } \exp \left( -C_2 n^{1-2 \alpha -4 \kappa }\right) +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$

Taking \(M_1=O(n^\gamma )\), we obtain

$$\begin{aligned}&P(|{\hat{\rho }}_j^0({\hat{\sigma }}_j^{-2}-\sigma _j^{-2})|>\varepsilon /2)\\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) +\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8 \kappa }\varepsilon ^2\right) \right. \\&\qquad \left. +\left( 4 n^{2 \kappa }+2 n^{\kappa }+2\right) A\exp \left( -s n^{\alpha }\right) \right) \end{aligned}$$

Similarly, we can prove

$$\begin{aligned}&P(|\sigma _j^{-2}({\hat{\rho }}_j^0-\rho _j^0)|\ge \varepsilon /2)\\&\quad \le (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) +(4 n^{2 \kappa }+2 n^{\kappa }+2) A\exp \left( -s n^{\alpha }\right) \\&\qquad +2 \exp \left( -O(n^{1-2 \alpha -2\xi +2\gamma } \varepsilon ^2)\right) \end{aligned}$$

Consequently, we obtain

$$\begin{aligned}&P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge \varepsilon \right\} \\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) +\left( \left( 2 n^{2 \kappa }+4 n^{\kappa }\right) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) \right. \\&\qquad \left. +\left( 4 n^{2 \kappa }+2 n^{\kappa }\right) A\exp \left( -s n^{\alpha }\right) \right) + (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) \\&\qquad +(4 n^{2 \kappa }+2 n^{\kappa }) A\exp \left( -s n^{\alpha }\right) +2 \exp \left( -O(n^{1-2 \alpha -2\xi +2\gamma } \varepsilon ^2)\right) \\&\qquad +2A \exp \left( -sn^{\alpha }\right) \end{aligned}$$

After some simplification, we have

$$\begin{aligned}&P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge \varepsilon \right\} \\&\quad \le 2 \exp \left( -O(n^{1-2 \alpha -2\xi -2\gamma } \varepsilon ^2)\right) \\&\qquad +\left( 8 n^{2 \kappa }+4 n^{\kappa }+2\right) A\exp \left( -s n^{\alpha }\right) \\&\qquad + (2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( -C_1 n^{1-2 \alpha -2\gamma -8\kappa }\varepsilon ^2\right) \end{aligned}$$

Setting \(\varepsilon =c_\rho n^{-\eta }\) and applying the union bound over \(j\in \widehat{{\mathcal {A}}}^\star \), we obtain

$$\begin{aligned}&P\left\{ \max _{j\in \widehat{{\mathcal {A}}}^\star }\left| {\hat{\rho }}_j-\rho _j\right| \ge c_\rho n^{-\eta } \right\} \\&\quad \le s_A\left\{ 2\exp \left( -O(c_\rho ^2n^{1-2 \alpha -2\xi -2\gamma -2\eta })\right) \right. \\&\qquad +(2 n^{2 \kappa }+4 n^{\kappa }) \exp \left( - O(c_\rho ^2n^{1-2 \alpha -2\gamma -2\eta -8\kappa })\right) \\&\qquad \left. +\left( 8 n^{2 \kappa }+4 n^{\kappa }+4\right) A\exp \left( -s n^{\alpha }\right) \right\} , \end{aligned}$$

where \(s_A\) is the size of \(\widehat{{\mathcal {A}}}^\star \). \(\square \)
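The uniform consistency statement can be illustrated with a small Monte Carlo sketch. It is not the paper's simulation design: the Gaussian data, the dimensions `p` and `q`, the sample sizes, and the helper `rho_hat_all` are all assumptions chosen for illustration. The responses are generated independently of the features, so every population \(\rho _j\) equals 0 and the largest \({\hat{\rho }}_j\) is itself the maximal deviation.

```python
import numpy as np

def rho_hat_all(X, Y):
    """rho-hat for every column of X against the response matrix Y (1/n scaling)."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S_xy = Xc.T @ Yc / n                     # cross-covariances, shape (p, q)
    S_y = Yc.T @ Yc / n                      # response covariance, shape (q, q)
    sigma2 = (Xc ** 2).mean(axis=0)          # feature variances, shape (p,)
    # Row-wise quadratic form S_xy[j] @ inv(S_y) @ S_xy[j] for each feature j
    quad = np.einsum("jk,kj->j", S_xy, np.linalg.solve(S_y, S_xy.T))
    return quad / sigma2

rng = np.random.default_rng(2)
p, q = 20, 2
max_dev = {}
for n in (50, 500, 5000):
    X = rng.normal(size=(n, p))
    Y = rng.normal(size=(n, q))   # independent of X, so each population rho_j is 0
    max_dev[n] = rho_hat_all(X, Y).max()
    print(n, max_dev[n])
```

The printed maximal deviation shrinks as the sample size grows, in line with the exponential tail bound, although a single run of this kind says nothing about the rate.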

Proof of Theorem 3.3

By the multiplication rule for conditional probabilities, it holds that

$$\begin{aligned} P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\right\} \ge P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} . \end{aligned}$$

On the one hand, \(P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \) has been bounded in Theorem 3.1. On the other hand, \(P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \) can be bounded by the same technique, namely, \(P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \ge 1-s_{2n}P\left\{ \left| {\hat{\rho }}_j-\rho _j\right| \ge c_\rho n^{-\gamma } \right\} \). The proof is very similar to that of Theorem 3.1 and is therefore omitted. The sure screening property of \(\widehat{{\mathcal {I}}}\) can be proved in a similar spirit. \(\square \)
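To see how the two bounds combine, write \(a_n\) and \(b_n\) (symbols introduced here only for illustration) for upper bounds on the two failure probabilities, so that \(P\left\{ {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \ge 1-a_n\) and \(P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\mid {\mathcal {A}}\subset \widehat{{\mathcal {A}}}\right\} \ge 1-b_n\) with \(a_n,b_n\in [0,1]\). Then

$$\begin{aligned} P\left\{ {\mathcal {M}}\subset \widehat{{\mathcal {M}}}\right\} \ge (1-a_n)(1-b_n)=1-a_n-b_n+a_nb_n\ge 1-a_n-b_n, \end{aligned}$$

so the sure screening property holds whenever both \(a_n\) and \(b_n\) tend to 0.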

Cite this article

Lu, J., Wang, D. & Hu, Q. Interaction screening via canonical correlation. Comput Stat 37, 2637–2670 (2022). https://doi.org/10.1007/s00180-022-01206-7
