Abstract
Although many nonlinear models of cognition have been proposed in the past 50 years, there has been little consideration of corresponding statistical techniques for their analysis. In analyses with nonlinear models, unmodeled variability from the selection of items or participants may lead to asymptotically biased estimation. This asymptotic bias, in turn, renders inference problematic. We show, for example, that a signal detection analysis of recognition memory data leads to asymptotic underestimation of sensitivity. To eliminate asymptotic bias, we advocate hierarchical models in which participant variability, item variability, and measurement error are modeled simultaneously. By accounting for multiple sources of variability, hierarchical models yield consistent and accurate estimates of participant and item effects in recognition memory. This article is written in tutorial format; we provide an introduction to Bayesian statistics, hierarchical modeling, and Markov chain Monte Carlo computational techniques.
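The underestimation described in the abstract can be illustrated with a small numerical sketch. The model here is an assumption made for illustration, not necessarily the authors' own: each item shifts the probit of the hit or false-alarm probability by an effect drawn from a normal distribution with mean 0 and standard deviation `sigma`, so that the average item sensitivity equals `d_true`. The function name `aggregated_dprime` and its parameters are hypothetical. The sketch pools response probabilities over items, as a conventional aggregated analysis would, and then applies the usual formula d' = z(hit rate) - z(false-alarm rate).

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is the z-transform


def aggregated_dprime(d_true, sigma, n_items=10_000):
    """Return d' computed from hit and false-alarm rates pooled over items.

    Assumed model (illustrative only): item effects are Normal(0, sigma^2)
    shifts on the probit scale, so the mean item sensitivity is d_true.
    """
    # Deterministic stand-in for random sampling: evenly spaced normal
    # quantiles, which approximate the limit of infinitely many items.
    effects = [sigma * STD.inv_cdf((i + 0.5) / n_items) for i in range(n_items)]

    # Pooled (aggregated) response probabilities across items.
    hit_rate = sum(STD.cdf(d_true / 2 + e) for e in effects) / n_items
    fa_rate = sum(STD.cdf(-d_true / 2 + e) for e in effects) / n_items

    # Conventional signal detection estimate from the pooled rates.
    return STD.inv_cdf(hit_rate) - STD.inv_cdf(fa_rate)


if __name__ == "__main__":
    for sigma in (0.0, 0.5, 1.0):
        print(sigma, round(aggregated_dprime(2.0, sigma), 3))
```

Under these assumptions the pooled estimate approaches `d_true / sqrt(1 + sigma**2)` rather than `d_true`, so it falls short of the true average sensitivity whenever `sigma > 0`, and collecting more data does not remove the shortfall. Modeling item variability explicitly, as the hierarchical approach does, is what avoids it.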
Additional information
This research is supported by NSF Grant SES-0095919 to J.N.R., Dongchu Sun, and Paul Speckman.
Cite this article
Rouder, J.N., Lu, J. An introduction to Bayesian hierarchical models with an application in the theory of signal detection. Psychonomic Bulletin & Review 12, 573–604 (2005). https://doi.org/10.3758/BF03196750
DOI: https://doi.org/10.3758/BF03196750