Stable correlation and robust feature screening

Science China Mathematics

Abstract

In this paper, we propose a new correlation, called stable correlation, to measure the dependence between two random vectors. The new correlation is well defined without the moment condition and is zero if and only if the two random vectors are independent. We also study its other theoretical properties. Based on the new correlation, we further propose a robust model-free feature screening procedure for ultrahigh dimensional data and establish its sure screening property and rank consistency property without imposing the subexponential or sub-Gaussian tail condition, which is commonly required in the literature of feature screening. We also examine the finite sample performance of the proposed robust feature screening procedure via Monte Carlo simulation studies and illustrate the proposed procedure by a real data example.
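The marginal screening idea described in the abstract can be illustrated with a minimal sketch: score each feature by a dependence measure with the response, then keep the top-ranked features. The abstract does not give the formula for stable correlation, so the sketch below swaps in the sample distance correlation as a stand-in. That is an assumption made purely for illustration: distance correlation requires finite first moments, which is exactly the condition the proposed measure is said to avoid. The function names, the simulated data, and the cutoff `d` are all hypothetical.

```python
import numpy as np


def _double_centered_distances(v):
    """Pairwise Euclidean distance matrix of the sample, double-centered."""
    v = np.asarray(v, dtype=float)
    if v.ndim == 1:
        v = v[:, None]  # treat a 1-D sample as n observations of one variable
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1)
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation (stand-in measure, NOT stable correlation)."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2 = (a * b).mean()  # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0  # a constant sample carries no dependence information
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def screen_features(X, y, d):
    """Rank the columns of X by marginal dependence with y; keep the top d."""
    scores = np.array(
        [distance_correlation(X[:, j], y) for j in range(X.shape[1])]
    )
    selected = np.argsort(-scores)[:d]  # indices of the d largest scores
    return selected, scores


# Hypothetical simulated data: only the first two features drive the response.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + 0.5 * rng.standard_normal(n)

selected, scores = screen_features(X, y, d=10)
```

Any other dependence measure that returns a nonnegative score can be dropped in for `distance_correlation` without changing `screen_features`, since the screening step only uses the ranking of the marginal scores.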

Acknowledgements

The first author was supported by the National Natural Science Foundation of China (Grant No. 11701034). The second author was supported by the National Science Foundation of the USA (Grant No. DMS-1820702). The authors are grateful to the two anonymous referees for their constructive comments and suggestions, which led to a significant improvement of an earlier version of the manuscript.

Author information

Corresponding author

Correspondence to Xu Guo.

About this article

Cite this article

Guo, X., Li, R., Liu, W. et al. Stable correlation and robust feature screening. Sci. China Math. 65, 153–168 (2022). https://doi.org/10.1007/s11425-019-1702-5

  • DOI: https://doi.org/10.1007/s11425-019-1702-5
