Abstract
In many empirical sciences, the causal mechanisms underlying various phenomena need to be studied. Structural equation modeling is a general framework for multivariate analysis that provides a powerful method for studying causal mechanisms. In many cases, however, classical structural equation modeling cannot estimate the causal directions of variables, because it explicitly or implicitly assumes Gaussianity and typically uses only the covariance structure of the data. In many applications, the data are non-Gaussian, which means that the data distribution may contain more information than the covariance matrix can capture. Many new methods have therefore been proposed recently that exploit this non-Gaussian structure to estimate the causal directions of variables. In this paper, we provide an overview of such recent developments in causal inference, focusing in particular on the non-Gaussian methods known as LiNGAM.
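As a brief illustration of why non-Gaussianity matters, the following self-contained numerical sketch (not taken from this paper; the two-variable setup, the uniform disturbances, and the squared-residual dependence check are illustrative assumptions) fits a linear regression in both directions and checks whether the residual is independent of the predictor. With non-Gaussian disturbances, only the true causal direction yields an independent residual; with Gaussian disturbances, the two directions are indistinguishable.

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True model: x -> y, with uniform (hence non-Gaussian) disturbances.
x = rng.uniform(-1.0, 1.0, n)
y = 0.8 * x + rng.uniform(-1.0, 1.0, n)

def dependence_after_regression(cause, effect):
    # Regress `effect` on `cause` by least squares, then return a crude
    # dependence proxy between regressor and residual: the correlation of
    # their squares (approximately zero when the two are independent).
    var_c, cov_ce = np.cov(cause, effect)[0]
    resid = effect - (cov_ce / var_c) * cause
    return np.corrcoef((cause - cause.mean()) ** 2,
                       (resid - resid.mean()) ** 2)[0, 1]

print("non-Gaussian, x -> y (true): ", dependence_after_regression(x, y))  # approx. 0
print("non-Gaussian, y -> x (false):", dependence_after_regression(y, x))  # clearly nonzero

# With Gaussian disturbances, the reverse-direction residual is also
# independent of its regressor, so the asymmetry disappears.
xg = rng.normal(size=n)
yg = 0.8 * xg + rng.normal(size=n)
print("Gaussian, x -> y:", dependence_after_regression(xg, yg))  # approx. 0
print("Gaussian, y -> x:", dependence_after_regression(yg, xg))  # also approx. 0

The squared-residual correlation above is only a simple stand-in for a proper independence measure; LiNGAM-type methods exploit independence more fully, for example via independent component analysis.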
Cite this article
Shimizu, S. LiNGAM: Non-Gaussian Methods for Estimating Causal Structures. Behaviormetrika 41, 65–98 (2014). https://doi.org/10.2333/bhmk.41.65