Abstract
This article provides an introductory overview of research on Hierarchical Bayesian Modeling in cognitive development. First, a brief historical summary and a definition of hierarchies in Bayesian modeling are given. Model structures are then described using four examples from the literature: models of the development of the shape bias, of learning ontological kinds, of learning causal schemata, and of object categorization. The Bayesian modeling approach is then compared with the connectionist and nativist modeling paradigms and considered in light of Marr’s (1982) three levels of description of information-processing mechanisms. In this context, psychologically plausible algorithms and proposals for their neural implementation are presented. Finally, criticisms and limitations of the approach are discussed, and needs for further research are identified.
References
Abdelbar AM, Hedetniemi SM (1998) Approximating MAPs for belief networks is NP-hard and other theorems. Artif Intell 102:21–38
Aldous DJ (1985) Exchangeability and related topics. In: Hennequin P (ed) École d’Été de Probabilités de Saint-Flour XIII – 1983. Springer, Berlin, pp 1–198
Anderson JR (1990) The adaptive character of thought. Lawrence Erlbaum Associates Inc, Hillsdale
Anderson JR (1991) The adaptive nature of human categorization. Psychol Rev 98:409–429
Anderson JR (2007) How can the human mind occur in the physical universe? Oxford University Press, New York
Anderson JR, Milson R (1989) Human memory: an adaptive perspective. Psychol Rev 96:703–719
Bar-Eli M, Azar OH, Ritov I, Keidar-Levin Y, Schein G (2007) Action bias among elite soccer goalkeepers: the case of penalty kicks. J Econ Psychol 28:606–621
Bonawitz E, Denison S, Griffiths TL, Gopnik A (2014) Probabilistic models, learning algorithms, and response variability: sampling in cognitive development. Trends Cogn Sci 18:497–500
Bowers JS, Davis CJ (2012) Bayesian just-so stories in psychology and neuroscience. Psychol Bull 138:389–414
Chater N, Oaksford M (1999) Ten years of the rational analysis of cognition. Trends Cogn Sci 3:57–65
Cohen H, Lefebvre C (2005) Handbook of categorization in cognitive science. Elsevier, Amsterdam [etc.]
Cooper GF (1990) The computational complexity of probabilistic inference using Bayesian belief networks. Artif Intell 42:393–405
Dagum P, Luby M (1993) Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artif Intell 60:141–153
Danks D, Griffiths TL, Tenenbaum JB (2003) Dynamical causal learning. In: Becker S, Thrun S, Obermayer K (eds) Advances in neural information processing systems. MIT Press, Cambridge, pp 67–74
David HA (1998) First (?) occurrence of common terms in probability and statistics–a second list, with corrections. Am Stat 52:36–40
Daw ND, Courville AC, Dayan P (2008) Semi-rational models of conditioning: the case of trial order. In: Chater N, Oaksford M (eds) The probabilistic mind. Prospects for Bayesian cognitive science. Oxford University Press, Oxford
Doucet A, de Freitas N, Gordon N (2001) Sequential Monte Carlo methods in practice. Springer, New York [etc.]
Draper D (1995) Inference and hierarchical modeling in the social sciences. J Educ Behav Stat 20:115–147
Ellsberg D (1961) Risk, ambiguity, and the savage axioms. Q J Econ 75:643–669
Endress AD (2013) Bayesian learning and the psychology of rule induction. Cognition 127:159–176
Ferguson TS (1973) A Bayesian analysis of some nonparametric problems. Ann Stat 1:209–230
Friston K (2010) The free-energy principle: a unified brain theory? Nat Rev Neurosci 11:127–138
Geisler WS (2003) Ideal observer analysis. In: Chalupa LM, Werner JS (eds) The visual neurosciences. MIT Press, Cambridge, pp 825–837
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. CRC Press, Boca Raton
Gelman A, Lee D, Guo J (2015) Stan: a probabilistic programming language for Bayesian inference and optimization. J Educ Behav Stat 40:530–543
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Gershman SJ, Blei DM (2012) A tutorial on Bayesian nonparametric models. J Math Psychol 56:1–12
Gershman SJ, Daw ND (2012) Perception, action and utility: the tangled skein. In: Rabinovich MI, Friston KJ, Varona P (eds) Principles of brain dynamics. Global state interactions. MIT Press, Cambridge, pp 293–312
Gigerenzer G, Hoffrage U, Goldstein DG (2008) Fast and frugal heuristics are plausible models of cognition: reply to Dougherty, Franco-Watkins, and Thomas (2008). Psychol Rev 115:230–239
Good IJ (1980) Some history of the hierarchical Bayesian methodology. Trabajos de Estadistica Y de Investigacion Operativa 31:489–519
Goodman ND, Ullman TD, Tenenbaum JB (2011) Learning a theory of causality. Psychol Rev 118:110–119
Gopnik A (2008) The theory theory as an alternative to the innateness hypothesis. In: Antony LM, Hornstein N (eds) Chomsky and his critics. Blackwell Publishing Ltd, Hoboken, pp 238–254
Gopnik A (2012) Scientific thinking in young children: theoretical advances, empirical research, and policy implications. Science 337:1623–1627
Gopnik A, Glymour C, Sobel DM, Schulz LE, Kushnir T, Danks D (2004) A theory of causal learning in children: causal maps and Bayes nets. Psychol Rev 111:3–32
Gopnik A, Wellman HM (2012) Reconstructing constructivism: causal models, Bayesian learning mechanisms, and the theory theory. Psychol Bull 138:1085–1108
Gordon NJ, Salmond DJ, Smith AFM (1993) Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEEE Proc F Radar Signal Process 140:107–113
Goswami U (ed) (2011) The Wiley-Blackwell handbook of childhood cognitive development. Wiley-Blackwell, Hoboken
Griffiths TL, Canini KR, Sanborn AN, Navarro D (2007) Unifying rational models of categorization via the hierarchical Dirichlet process. In: McNamara DS, Trafton JG (eds) Proceedings of the 29th Annual conference of the Cognitive Science Society. Erlbaum, Hillsdale, NJ, pp 323–328
Griffiths TL, Kemp C, Tenenbaum JB (2008) Bayesian models of cognition. In: Sun R (ed) The Cambridge handbook of computational psychology. Cambridge University Press, Cambridge, pp 59–100
Griffiths TL, Chater N, Kemp C, Perfors A, Tenenbaum JB (2010) Probabilistic models of cognition: exploring representations and inductive biases. Trends Cogn Sci 14:357–364
Griffiths TL, Chater N, Norris D, Pouget A (2012) How the Bayesians got their beliefs (and what those beliefs actually are): Comment on Bowers and Davis (2012). Psychol Bull 138:415–422
Grimmer J (2011) An introduction to Bayesian inference via variational approximations. Polit Anal 19:32–47
Holyoak KJ, Cheng PW (2010) Causal learning and inference as a rational process: the new synthesis. Annu Rev Psychol 62:135–163
Huang Y, Rao RPN (2011) Predictive coding. WIREs Cogn Sci 2:580–593
Jones M, Love BC (2011) Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition. Behav Brain Sci 34:169–188
Kemp C, Perfors A, Tenenbaum JB (2004) Learning domain structures. In: Forbus K, Gentner D, Regier T (eds) Proceedings of the 26th annual conference of the cognitive science society. Lawrence Erlbaum Associates Inc, Mahwah, New Jersey, pp 720–725
Kemp C (2008) The acquisition of inductive constraints. Dissertation. Cambridge
Kemp C, Perfors A, Tenenbaum JB (2007a) Learning overhypotheses with hierarchical Bayesian models. Dev Sci 10:307–321
Kemp C, Tenenbaum JB, Niyogi S, Griffiths TL (2010) A probabilistic model of theory formation. Cognition 114:165–196
Kemp C, Goodman ND, Tenenbaum JB (2007b) Learning causal schemata. In: McNamara DS, Trafton JG (eds) Proceedings of the 29th annual conference of the cognitive science society. Erlbaum, Hillsdale, NJ, pp 389–394
Kruschke JK (2010) Doing Bayesian data analysis. A tutorial with R and BUGS. Academic Press, Burlington, MA
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22:79–86
Kwisthout J (2010) Two new notions of abduction in Bayesian networks. In: Proceedings of the 22nd Benelux conference on artificial intelligence, pp 82–89
Kwisthout J, van Rooij I (2013) Bridging the gap between theory and practice of approximate Bayesian inference. Cogn Syst Res (Special Issue on ICCM2012) 24:2–8
Kwisthout J, Wareham T, van Rooij I (2011) Bayesian intractability is not an ailment that approximation can cure. Cogn Sci 35:779–784
Lee MD (2011) How cognitive modeling can benefit from hierarchical Bayesian models. J Math Psychol (Special Issue on Hierarchical Bayesian Models) 55:1–7
Lien Y, Cheng PW (2000) Distinguishing genuine from spurious causes: a coherence hypothesis. Cogn Psychol 40:87–137
Lindley DV, Smith AFM (1972) Bayes estimates for the linear model. J R Stat Soc Ser B (Methodological) 34:1–41
Love BC, Medin DL, Gureckis TM (2004) SUSTAIN: a network model of category learning. Psychol Rev 111:309–332
Lu H, Yuille AL, Liljeholm M, Cheng PW, Holyoak KJ (2008) Bayesian generic priors for causal learning. Psychol Rev 115:955–984
Lunn D, Thomas A, Best N, Spiegelhalter D (2000) WinBUGS–a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput 10:325–337
Mansinghka V, Kemp C, Griffiths TL, Tenenbaum JB (2006) Structured priors for structure learning. In: Dechter R, Richardson T (eds) Proceedings of the twenty-second conference on uncertainty in artificial intelligence. AUAI Press, Arlington, Virginia, pp 324–331
Marcus GF (2010) Neither size fits all: comment on McClelland et al. and Griffiths et al. Trends Cogn Sci 14:346–347
Marcus GF, Davis E (2013) How robust are probabilistic models of higher-level cognition? Psychol Sci 24:2351–2360
Markson L, Diesendruck G, Bloom P (2008) The shape of thought. Dev Sci 11:204–208
Marr D (1982) Vision: a computational investigation into the human representation and processing of visual information. MIT Press, Cambridge
McClelland JL, Botvinick MM, Noelle DC, Plaut DC, Rogers TT, Seidenberg MS, Smith LB (2010) Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends Cogn Sci 14:348–356
Milch B, Marthi B, Russell S, Sontag D, Ong DL, Kolobov A (2005) BLOG: probabilistic models with unknown objects. In: Proceedings of the 19th international joint conference on Artificial intelligence (IJCAI’05). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 1352–1359
Navarro DJ, Griffiths TL, Steyvers M, Lee MD (2006) Modeling individual differences using Dirichlet processes. J Math Psychol (Special Issue on Model Selection: Theoretical Developments and Applications) 50:101–122
Neal RM (2000) Markov chain sampling methods for Dirichlet process mixture models. J Comput Graph Stat 9:249–265
Nosofsky RM (1986) Attention, similarity, and the identification-categorization relationship. J Exp Psychol Gen 115:39–61
Oniśko A, Druzdzel MJ, Wasyluk H (2001) Learning Bayesian network parameters from small data sets: application of Noisy-OR gates. Int J Approx Reason 27:165–182
Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Francisco
Perfors AF, Tenenbaum JB, Regier T (2006) Poverty of the stimulus? A rational approach. In: Sun R, Miyake N (eds) Proceedings of the 28th annual conference of the cognitive science society. Lawrence Erlbaum Associates Inc, Mahwah, New Jersey, pp 663–668
Perfors A, Tenenbaum JB, Griffiths TL, Xu F (2011) A tutorial introduction to Bayesian models of cognitive development. Cognition 120:302–321
Plummer M (2003) JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. In: Hornik K, Leisch F, Zeileis A (eds) Proceedings of the 3rd international workshop on distributed statistical computing, pp 125–134
Pouget A, Beck JM, Ma WJ, Latham PE (2013) Probabilistic brains: knowns and unknowns. Nat Neurosci 16:1170–1178
Ross BH, Makin VS (1999) Prototype versus exemplar models in cognition. In: Sternberg RJ (ed) The nature of cognition. MIT Press, Cambridge, MA, pp 205–241
Sakamoto Y, Jones M, Love B (2008) Putting the psychology back into psychological models: Mechanistic versus rational approaches. Memory Cogn 36:1057–1065
Sanborn AN, Griffiths TL, Navarro DJ (2010) Rational approximations to rational models: alternative algorithms for category learning. Psychol Rev 117:1144–1167
Schmidt LA, Kemp C, Tenenbaum JB (2006) Nonsense and sensibility: inferring unseen possibilities. In: Sun R, Miyake N (eds) Proceedings of the 28th annual conference of the cognitive science society. Lawrence Erlbaum Associates Inc, Mahwah, New Jersey, pp 744–749
Schulz L (2012) The origins of inquiry: inductive inference and exploration in early childhood. Trends Cogn Sci 16:382–389
Shimony SE (1994) Finding MAPs for belief networks is NP-hard. Artif Intell 68:399–410
Sim ZL, Yuan S, Xu F (2011) Acquiring word learning biases. In: Carlson L, Hoelscher C, Shipley TF (eds) Proceedings of the 33rd annual conference of the cognitive science society. Cognitive Science Society, Austin, TX, pp 2544–2549
Smith JD, Minda JP (1998) Prototypes in the mist: the early epochs of category learning. J Exp Psychol Learn Memory Cogn 24:1411–1436
Smith LB, Jones SS, Landau B, Gershkoff-Stowe L, Samuelson L (2002) Object name learning provides on-the-job training for attention. Psychol Sci 13:13–19
Soja NN, Carey S, Spelke ES (1991) Ontological categories guide young children’s inductions of word meaning: Object terms and substance terms. Cognition 38:179–211
Teh YW (2010) Dirichlet process. In: Sammut C, Webb G (eds) Encyclopedia of machine learning. Springer, US, pp 280–287
Teh YW, Jordan MI, Beal MJ, Blei DM (2004) Sharing clusters among related groups: hierarchical Dirichlet processes. In: Saul LK, Weiss Y, Bottou L (eds) Advances in neural information processing systems 17. Proceedings of the 2004 conference. MIT Press, Cambridge, MA, pp 1385–1392
Teh YW, Jordan MI, Beal MJ, Blei DM (2006) Hierarchical Dirichlet processes. J Am Stat Assoc 101:1566–1581
Tenenbaum JB, Griffiths TL, Kemp C (2006) Theory-based Bayesian models of inductive learning and reasoning. Trends Cogn Sci (Special Issue: Probabilistic models of cognition) 10:309–318
Tenenbaum JB, Kemp C, Griffiths TL, Goodman ND (2011) How to grow a mind: statistics, structure, and abstraction. Science 331:1279–1285
Thomson R, Lebiere C (2013) Constraining Bayesian inference with cognitive architectures: an updated associative learning mechanism in ACT-R. In: Knauf M, Pauen M, Sebanz N, Wachsmuth I (eds) Proceedings of the 35th annual meeting of the cognitive science society. Cognitive Science Society, Austin, TX, pp 3539–3544
Tversky A, Kahneman D (1983) Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychol Rev 90:293–315
West R, Stanovich K (2003) Is probability matching smart? Associations between probabilistic choices and cognitive ability. Memory Cogn 31:243–251
Wills AJ, Pothos EM (2012) On the adequacy of current empirical evaluations of formal models of categorization. Psychol Bull 138:102–125
Xu F, Tenenbaum JB (2007) Word learning as Bayesian inference. Psychol Rev 114:245–272
Acknowledgments
The authors would like to thank the anonymous reviewers for their insightful advice and helpful suggestions.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Glassen, T., Nitsch, V. Hierarchical Bayesian models of cognitive development. Biol Cybern 110, 217–227 (2016). https://doi.org/10.1007/s00422-016-0686-6
DOI: https://doi.org/10.1007/s00422-016-0686-6