Abstract
The discriminability measure \(d'\) is widely used in psychology to estimate sensitivity independently of response bias. The conventional approach estimates \(d'\) via a transformation of the hit rate and the false-alarm rate. When performance is perfect, correction methods must be applied before \(d'\) can be calculated, but these corrections distort the estimate. In three simulation studies, we show that distortion in \(d'\) estimation also depends on other properties of the experimental design (number of trials, sample size, sample variance, task difficulty). Combined with the correction method, these properties make the distortion in any specific design complex, and in the worst cases it can mislead statistical inference (Type I and Type II errors). To address this problem, we propose that researchers simulate \(d'\) estimation to explore the impact of design choices, given anticipated or observed data. We introduce an R Shiny application that estimates \(d'\) distortion, giving researchers the means to identify distortion and take steps to minimize its impact.
Supplementary material
The supplemental materials for simulations can be found at https://github.com/Van-Zandt-Lab-at-OSU/Estimation-of-d-prime
Open Practices Statement
This study does not include empirical data. The R code used to produce the simulated data is available at https://github.com/Van-Zandt-Lab-at-OSU/Estimation-of-d-prime.
Notes
All values used in Simulation 3 were specified in Simulation 3.
See the Supplemental materials for negative values of c.
Recall that the effects of \(c \ne 0\) are symmetric (see Fig. 4); for any effects observed for \(c=.2\) or \(.4\), we would expect to see the same effects for \(c=-.2\) or \(-.4\).
Ethics declarations
This material is based upon work performed while Dr. Trisha Van Zandt was serving at the National Science Foundation. Any opinion, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors have no conflicts of interest to disclose. The authors did not receive support for the submitted work. Drs. Mark Pitt and Trisha Van Zandt contributed equally.
Appendix 1: Why increasing the sample size does not reduce \(d'\) distortion
In this appendix, we explain why increasing the sample size (more participants) would exacerbate the distortion problem in \(d'\) estimation when mathematical corrections are applied. Assume that a group of M participants each completed \(N_s\) signal trials and \(N_n\) noise trials. In the ideal situation, we hope that the average of their corresponding \(d'\) estimates (\(\hat{d'_1}\), \(\hat{d'_2}\), ..., \(\hat{d'_M}\)) converges to the true group mean \(d'\) as \(M \rightarrow \infty \), following the central limit theorem.
However, the distribution of \(\hat{d'}\), computed as a transformation of \(p_{\text{hit}}\) (hit rate) and \(p_{\text{false alarm}}\) (false-alarm rate) in Eq. 1, does not satisfy the conditions for the central limit theorem for independent, non-identically distributed variables when the numbers of trials are fixed at \(N_s\) and \(N_n\). For Participant i with true SDT parameters \(d'_i\) and \(c_i\), the distribution of the hit rate (\(p_{\text{hit},i}\)) is
$$N_s \, p_{\text{hit},i} \sim \text{Binomial}\left(N_s,\; \Phi\left(\frac{d'_i}{2} - c_i\right)\right),$$
where \(\Phi\) is the Gaussian CDF. The mean of \(p_{\text{hit},i}\), \(\Phi(\frac{d'_i}{2} - c_i)\), is the probability of correctly identifying each signal trial. Similarly, for noise trials, the mean of the correct-rejection rate is \(\Phi(\frac{d'_i}{2} + c_i)\), so the mean of the false-alarm rate \(p_{\text{false alarm},i}\) is \(1-\Phi(\frac{d'_i}{2} + c_i)\). Ideally, using Eq. 1 for \(d'\) estimation,
$$\hat{d'}_i = z(p_{\text{hit},i}) - z(p_{\text{false alarm},i}),$$
where z is the inverse of the Gaussian CDF, we hope to achieve
$$\mathbb{E}\left[\hat{d'}_i\right] = d'_i. \quad (3)$$
However, with a finite number of signal trials \(N_s\), \(P(p_{\text{hit}} =1)>0\): there is always a non-zero probability of perfect performance on signal trials. Similarly, \(P(p_{\text{false alarm}}=0)>0\) when \(N_n\) is finite, so there is always a non-zero probability of perfect performance on noise trials. Because the inverse Gaussian CDF z(p) in Eq. 1 is defined only on (0, 1), the distributions of \(\hat{d'}_i\) (\(i=1,2,...,M\)) are ill-defined when \(p_{\text{hit}}\) or \(p_{\text{false alarm}}\) is 0 or 1. As a result, Eq. 3 does not hold, and \(\hat{d'}_i\) has no finite expected value. Consequently, \(\hat{d'_1}\), \(\hat{d'_2}\), ..., \(\hat{d'_M}\) do not satisfy the conditions for the central limit theorem for independent, non-identically distributed variables, because they lack well-defined, finite expected values and variances.
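The probability of perfect performance follows directly from the binomial distribution of the hit rate: \(P(p_{\text{hit}}=1) = \Phi(\frac{d'}{2}-c)^{N_s}\). A minimal sketch (in Python rather than the R used for the paper's simulations; the parameter values are illustrative, not taken from the article) shows that this probability is far from negligible at moderate sensitivity:

```python
from math import erf, sqrt

def phi(x):
    # Standard Gaussian CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

def p_perfect_hits(d_prime, c, n_s):
    # Probability that all n_s signal trials are hits:
    # P(p_hit = 1) = Phi(d'/2 - c)^n_s, which is > 0 for any finite n_s
    return phi(d_prime / 2 - c) ** n_s

# Illustrative values: d' = 2, c = 0, 20 signal trials
print(p_perfect_hits(2.0, 0.0, 20))  # ≈ 0.032
```

Because this probability is strictly positive for any finite \(N_s\), the uncorrected estimator \(\hat{d'}\) is undefined with positive probability on every experiment of finite length.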
After applying the mathematical corrections, the distributions of \(\hat{d'}_r\) and \(\hat{d'}_{ll}\) do have finite expected values and finite variances, so they satisfy the conditions for the central limit theorem to hold. However, the correction distorts the distributions of \(\hat{d'}_r\) and \(\hat{d'}_{ll}\). For example, using replacement with a correction value of a, the distribution of the hit rate for Participant i is corrected to
$$p_{\text{hit},r,i} = \begin{cases} a & \text{if } p_{\text{hit},i} = 0,\\ 1-a & \text{if } p_{\text{hit},i} = 1,\\ p_{\text{hit},i} & \text{otherwise}, \end{cases}$$
and
$$p_{\text{false alarm},r,i} = \begin{cases} a & \text{if } p_{\text{false alarm},i} = 0,\\ 1-a & \text{if } p_{\text{false alarm},i} = 1,\\ p_{\text{false alarm},i} & \text{otherwise}. \end{cases}$$
Thus the mean of \(p_{\text{hit},r,i}\) is no longer \(\Phi(\frac{d'_i}{2} - c_i)\), but is corrected to
$$\Phi\left(\frac{d'_i}{2} - c_i\right) + a\left[1-\Phi\left(\frac{d'_i}{2} - c_i\right)\right]^{N_s} - a\,\Phi\left(\frac{d'_i}{2} - c_i\right)^{N_s}.$$
Similarly, the mean of \(p_{\text{false alarm},r,i}\) is no longer \(1-\Phi(\frac{d'_i}{2} + c_i)\), but is corrected to
$$1-\Phi\left(\frac{d'_i}{2} + c_i\right) + a\,\Phi\left(\frac{d'_i}{2} + c_i\right)^{N_n} - a\left[1-\Phi\left(\frac{d'_i}{2} + c_i\right)\right]^{N_n}.$$
Therefore,
$$\mathbb{E}\left[\hat{d'}_{r,i}\right] = \mathbb{E}\left[z(p_{\text{hit},r,i})\right] - \mathbb{E}\left[z(p_{\text{false alarm},r,i})\right] \ne d'_i. \quad (4)$$
As a result, the average of \(\hat{d'_{r,1}}\), \(\hat{d'_{r,2}}\), ..., \(\hat{d'_{r,M}}\) does not converge to the true group mean \(d'\) as \(M \rightarrow \infty \): the estimator is asymptotically biased. Instead, it converges to a different value, where the amount of bias is determined by both the correction method and the other experimental parameters in Eq. 4. The same issue applies to \(\hat{d'}_{ll}\). Examples of the values they converge to are shown in Fig. 5, which demonstrates that these values can differ substantially from the true mean \(d'\) in many cases. Consequently, increasing the sample size exacerbates the distortion in \(d'\) estimation caused by mathematical correction: the estimates concentrate ever more tightly around the wrong value.
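This asymptotic bias can be seen directly by Monte Carlo. The sketch below (Python rather than the paper's R; the parameter values \(d'=2.5\), \(c=0\), 20 trials per condition, and \(a = 1/(2N)\) are illustrative assumptions, not the paper's settings) averages replacement-corrected estimates over many simulated participants; the average settles on a value that differs systematically from the true \(d'\) no matter how large M becomes:

```python
import random
from statistics import NormalDist, mean

nd = NormalDist()

def dprime_replacement(d_true, c, n_s, n_n, a, rng):
    # Simulate one participant's hit and false-alarm rates from the
    # equal-variance SDT model, then apply the replacement correction.
    p_h = nd.cdf(d_true / 2 - c)       # true hit probability
    p_f = 1 - nd.cdf(d_true / 2 + c)   # true false-alarm probability
    hits = sum(rng.random() < p_h for _ in range(n_s)) / n_s
    fas  = sum(rng.random() < p_f for _ in range(n_n)) / n_n
    hits = min(max(hits, a), 1 - a)    # replace 0 with a, 1 with 1 - a
    fas  = min(max(fas, a), 1 - a)
    return nd.inv_cdf(hits) - nd.inv_cdf(fas)

rng = random.Random(1)
M = 20000  # large sample of participants
est = mean(dprime_replacement(2.5, 0.0, 20, 20, 1 / 40, rng)
           for _ in range(M))
print(est)  # differs systematically from the true d' = 2.5
```

Increasing M shrinks the standard error of the average but leaves the bias untouched, which is the sense in which more participants make the problem worse rather than better.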
About this article
Cite this article
Chen, Y., Daly, H.R., Pitt, M.A. et al. Assessing the distortions introduced when calculating d’: A simulation approach. Behav Res 56, 7728–7747 (2024). https://doi.org/10.3758/s13428-024-02447-8