Abstract
The discriminability measure \(d'\) is widely used in psychology to estimate sensitivity independently of response bias. The conventional approach estimates \(d'\) via a transformation of the hit rate and the false-alarm rate. When performance is perfect, correction methods must be applied before \(d'\) can be calculated, but these corrections distort the estimate. In three simulation studies, we show that distortion in \(d'\) estimation also depends on other properties of the experimental design (number of trials, sample size, sample variance, task difficulty). Combined with the correction method, these properties make the distortion in any specific design complex, and in the worst cases it can mislead statistical inference (Type I and Type II errors). To address this problem, we propose that researchers simulate \(d'\) estimation to explore the impact of design choices, given anticipated or observed data. We introduce an R Shiny application that estimates \(d'\) distortion, giving researchers the means to identify distortion and take steps to minimize its impact.
Supplementary material
The supplemental materials for simulations can be found at https://github.com/Van-Zandt-Lab-at-OSU/Estimation-of-d-prime
Open Practices Statement
This study does not include empirical data. The R code used to produce the simulated data is available at https://github.com/Van-Zandt-Lab-at-OSU/Estimation-of-d-prime.
Notes
All values used in Simulation 3 were specified in Simulation 3.
See the Supplemental materials for negative values of c.
Recall that the effects of \(c \ne 0\) are symmetric (see Fig. 4); for any effects observed for \(c=.2\) or \(.4\), we would expect to see the same effects for \(c=-.2\) or \(-.4\).
Ethics declarations
This material is based upon work performed while Dr. Trisha Van Zandt was serving at the National Science Foundation. Any opinion, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors have no conflicts of interest to disclose. The authors did not receive support for the submitted work. Drs. Mark Pitt and Trisha Van Zandt contributed equally.
Appendix 1: Why increasing the sample size does not reduce \(d'\) distortion
In this appendix, we explain why increasing the sample size (more participants) would exacerbate the distortion problem in \(d'\) estimation when mathematical corrections are applied. Assume that a group of M participants each completed \(N_s\) signal trials and \(N_n\) noise trials. In the ideal situation, we hope that the average of their corresponding \(d'\) estimates (\(\hat{d'_1}\), \(\hat{d'_2}\), ..., \(\hat{d'_M}\)) converges to the true group mean \(d'\) as \(M \rightarrow \infty \), following the central limit theorem.
However, the distribution of \(\hat{d'}\), computed as a transformation of \(p_{\text{hit}}\) (hit rate) and \(p_{\text{false alarm}}\) (false-alarm rate) in Eq. 1, does not satisfy the conditions for the central limit theorem for independent, non-identically distributed variables when the numbers of trials are fixed at \(N_s\) and \(N_n\). For Participant i with true SDT parameters \(d'_i\) and \(c_i\), the distribution of the hit rate (\(p_{\text{hit},i}\)) is
$$N_s \, p_{\text{hit},i} \sim \text{Binomial}\left(N_s,\; \Phi\left(\frac{d'_i}{2} - c_i\right)\right),$$
where \(\Phi\) is the Gaussian CDF. The mean of \(p_{\text{hit},i}\), \(\Phi(\frac{d'_i}{2} - c_i)\), is the probability of correctly identifying each signal trial. Similarly, for noise trials, the mean of the correct-rejection rate is \(\Phi(\frac{d'_i}{2} + c_i)\), so the mean of the false-alarm rate \(p_{\text{false alarm},i}\) is \(1-\Phi(\frac{d'_i}{2} + c_i)\). Ideally, using Eq. 1 for \(d'\) estimation,
$$\hat{d'}_i = z(p_{\text{hit},i}) - z(p_{\text{false alarm},i}),$$
where z is the inverse of the Gaussian CDF, we hope to achieve
$$\mathbb{E}\left[\hat{d'}_i\right] = d'_i. \quad (3)$$
However, with a finite number of signal trials \(N_s\), \(P(p_{\text{hit}} =1)>0\): there is always a non-zero probability of perfect performance on signal trials. Similarly, \(P(p_{\text{false alarm}}=0)>0\) when \(N_n\) is finite, so there is always a non-zero probability of perfect performance on noise trials. Because the inverse Gaussian CDF z(p) in Eq. 1 is defined only on (0, 1), the distributions of \(\hat{d'}_i\) (\(i=1,2,...,M\)) are ill-defined when \(p_{\text{hit}}\) or \(p_{\text{false alarm}}\) is 0 or 1. As a result, Eq. 3 does not hold, and \(\hat{d'}_i\) has no finite expected value. Consequently, \(\hat{d'_1}\), \(\hat{d'_2}\), ..., \(\hat{d'_M}\) do not satisfy the conditions for the central limit theorem for independent, non-identically distributed variables, because they lack well-defined, finite expected values and variances.
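The probability of perfect performance follows directly from the binomial distribution of the hit rate: \(P(p_{\text{hit}}=1) = \Phi(\frac{d'}{2}-c)^{N_s}\). A minimal sketch (in Python rather than the R used for the paper's simulations; the parameter values are illustrative, not taken from the article) shows that this probability is far from negligible at moderate sensitivity:

```python
from math import erf, sqrt

def phi(x):
    # Standard Gaussian CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

def p_perfect_hits(d_prime, c, n_s):
    # Probability that all n_s signal trials are hits:
    # P(p_hit = 1) = Phi(d'/2 - c)^n_s, which is > 0 for any finite n_s
    return phi(d_prime / 2 - c) ** n_s

# Illustrative values: d' = 2, c = 0, 20 signal trials
print(p_perfect_hits(2.0, 0.0, 20))  # ≈ 0.032
```

Because this probability is strictly positive for any finite \(N_s\), the uncorrected estimator \(\hat{d'}\) is undefined with positive probability on every experiment of finite length.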
After applying the mathematical corrections, the distributions of \(\hat{d'}_r\) and \(\hat{d'}_{ll}\) do have finite expected values and finite variances, so they satisfy the conditions for the central limit theorem to hold. However, the correction distorts the distributions of \(\hat{d'}_r\) and \(\hat{d'}_{ll}\). For example, using replacement with a correction value of a, the distribution of the hit rate for Participant i is corrected to
$$p_{\text{hit},r,i} = \begin{cases} a & \text{if } p_{\text{hit},i} = 0,\\ 1-a & \text{if } p_{\text{hit},i} = 1,\\ p_{\text{hit},i} & \text{otherwise}, \end{cases}$$
and
$$p_{\text{false alarm},r,i} = \begin{cases} a & \text{if } p_{\text{false alarm},i} = 0,\\ 1-a & \text{if } p_{\text{false alarm},i} = 1,\\ p_{\text{false alarm},i} & \text{otherwise}. \end{cases}$$
Thus the mean of \(p_{\text{hit},r,i}\) is no longer \(\Phi(\frac{d'_i}{2} - c_i)\), but is corrected to
$$\Phi\left(\frac{d'_i}{2} - c_i\right) + a\left[1-\Phi\left(\frac{d'_i}{2} - c_i\right)\right]^{N_s} - a\,\Phi\left(\frac{d'_i}{2} - c_i\right)^{N_s}.$$
Similarly, the mean of \(p_{\text{false alarm},r,i}\) is no longer \(1-\Phi(\frac{d'_i}{2} + c_i)\), but is corrected to
$$1-\Phi\left(\frac{d'_i}{2} + c_i\right) + a\,\Phi\left(\frac{d'_i}{2} + c_i\right)^{N_n} - a\left[1-\Phi\left(\frac{d'_i}{2} + c_i\right)\right]^{N_n}.$$
Therefore,
$$\mathbb{E}\left[\hat{d'}_{r,i}\right] = \mathbb{E}\left[z(p_{\text{hit},r,i})\right] - \mathbb{E}\left[z(p_{\text{false alarm},r,i})\right] \ne d'_i. \quad (4)$$
As a result, the average of \(\hat{d'_{r,1}}\), \(\hat{d'_{r,2}}\), ..., \(\hat{d'_{r,M}}\) does not converge to the true group mean \(d'\) as \(M \rightarrow \infty \): the estimator is asymptotically biased. Instead, it converges to a different value, where the amount of bias is determined by both the correction method and the other experimental parameters in Eq. 4. The same issue applies to \(\hat{d'}_{ll}\). Examples of the values they converge to are shown in Fig. 5, which demonstrates that these values can differ substantially from the true mean \(d'\) in many cases. Consequently, increasing the sample size exacerbates the distortion in \(d'\) estimation caused by mathematical correction: the estimates concentrate ever more tightly around the wrong value.
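This asymptotic bias can be seen directly by Monte Carlo. The sketch below (Python rather than the paper's R; the parameter values \(d'=2.5\), \(c=0\), 20 trials per condition, and \(a = 1/(2N)\) are illustrative assumptions, not the paper's settings) averages replacement-corrected estimates over many simulated participants; the average settles on a value that differs systematically from the true \(d'\) no matter how large M becomes:

```python
import random
from statistics import NormalDist, mean

nd = NormalDist()

def dprime_replacement(d_true, c, n_s, n_n, a, rng):
    # Simulate one participant's hit and false-alarm rates from the
    # equal-variance SDT model, then apply the replacement correction.
    p_h = nd.cdf(d_true / 2 - c)       # true hit probability
    p_f = 1 - nd.cdf(d_true / 2 + c)   # true false-alarm probability
    hits = sum(rng.random() < p_h for _ in range(n_s)) / n_s
    fas  = sum(rng.random() < p_f for _ in range(n_n)) / n_n
    hits = min(max(hits, a), 1 - a)    # replace 0 with a, 1 with 1 - a
    fas  = min(max(fas, a), 1 - a)
    return nd.inv_cdf(hits) - nd.inv_cdf(fas)

rng = random.Random(1)
M = 20000  # large sample of participants
est = mean(dprime_replacement(2.5, 0.0, 20, 20, 1 / 40, rng)
           for _ in range(M))
print(est)  # differs systematically from the true d' = 2.5
```

Increasing M shrinks the standard error of the average but leaves the bias untouched, which is the sense in which more participants make the problem worse rather than better.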
About this article
Cite this article
Chen, Y., Daly, H.R., Pitt, M.A. et al. Assessing the distortions introduced when calculating d’: A simulation approach. Behav Res 56, 7728–7747 (2024). https://doi.org/10.3758/s13428-024-02447-8