Abstract
We present a new algorithm for independent component analysis with provable performance guarantees. In particular, suppose we are given samples of the form \(y = Ax + \eta \), where \(A\) is an unknown but non-singular \(n \times n\) matrix, \(x\) is a random variable whose coordinates are independent and have fourth-order moments strictly less than that of a standard Gaussian random variable, and \(\eta \) is an \(n\)-dimensional Gaussian random variable with unknown covariance \(\varSigma \). We give an algorithm that provably recovers \(A\) and \(\varSigma \) up to an additive \(\epsilon \) and whose running time and sample complexity are polynomial in \(n\) and \(1 / \epsilon \). To accomplish this, we introduce a novel “quasi-whitening” step that may be useful in other applications where there is additive Gaussian noise whose covariance is unknown. We also give a general framework for finding all local optima of a function, given an oracle that approximately finds just one. This step, overlooked in previous attempts, is crucial to our algorithm: it allows us to control the accumulation of error as we recover the columns of \(A\) one at a time via local search.
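To make the generative model concrete, the following is a minimal Python sketch (not from the paper; the variable names, dimensions, and the choice of Rademacher sources are illustrative assumptions) that produces samples satisfying the abstract's moment condition, since \(\pm 1\) signals have fourth moment \(1 < 3\), the fourth moment of a standard Gaussian:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 100_000  # dimension and number of samples (illustrative choices)

# Unknown non-singular mixing matrix A and noise covariance Sigma.
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
Sigma = B @ B.T  # an arbitrary positive semidefinite covariance

# Independent source coordinates with fourth moment below the Gaussian's:
# Rademacher (+/-1) signals have E[x_i^4] = 1 < 3 = E[g^4] for g ~ N(0, 1).
x = rng.choice([-1.0, 1.0], size=(n, m))

# Additive Gaussian noise with unknown covariance Sigma.
eta = rng.multivariate_normal(np.zeros(n), Sigma, size=m).T

# Observed samples y = A x + eta; the task is to recover A and Sigma.
y = A @ x + eta
```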
Notes
Technically, there are \(2n\) local maxima, since for each direction \(u\) that is a local maximum, \(-u\) is one as well, but this is an unimportant detail for our purposes.
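The \(\pm u\) pairing can be checked numerically. The sketch below (an assumption for illustration, not the paper's exact functional) uses the empirical fourth cumulant of the projection \(\langle u, y\rangle \), the kind of objective optimized in cumulant-based ICA; it is an even function of \(u\), so every local maximum at \(u\) is mirrored at \(-u\):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 50_000
# Noiseless mixed samples with Rademacher sources, for illustration.
y = rng.standard_normal((n, n)) @ rng.choice([-1.0, 1.0], size=(n, m))

def fourth_cumulant(u, y):
    # Empirical fourth cumulant of the projection <u, y>; an
    # illustrative stand-in for the local-search objective.
    z = u @ y
    return np.mean(z**4) - 3.0 * np.mean(z**2) ** 2

u = rng.standard_normal(n)
u /= np.linalg.norm(u)

# z(-u) = -z(u), and the objective depends only on even powers of z,
# so F(u) = F(-u): local maxima come in (u, -u) pairs, giving 2n in all.
assert np.isclose(fourth_cumulant(u, y), fourth_cumulant(-u, y))
```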
Cite this article
Arora, S., Ge, R., Moitra, A. et al. Provable ICA with Unknown Gaussian Noise, and Implications for Gaussian Mixtures and Autoencoders. Algorithmica 72, 215–236 (2015). https://doi.org/10.1007/s00453-015-9972-2