A bad graph for medical screening

prevalence
sensitivity
specificity
prior
posterior
Author

Daniel Kaplan

Published

April 30, 2023

In Lesson 35, in the context of medical screening tests, we presented diagrams like this one.

This diagram is based on only three basic numbers—sensitivity, specificity, and prevalence. Exactly the same information could be presented in a 2x2 table:

| Test result | Sick patients | Healthy patients |
|---|---|---|
| \(\mathbb P\) | 12% (true positives) | 26% (false positives) |
| \(\mathbb N\) | 3% (false negatives) | 60% (true negatives) |

The four numbers necessarily add up to 100% (the rounded values shown sum to 101%), so one of them is redundant. To generate the table we only need the three basic numbers: sensitivity, specificity, and prevalence.
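As a minimal sketch of how the table follows from the three basic numbers, the Python snippet below computes the four cells. The function name `screening_table` is hypothetical, and the inputs (prevalence 15%, sensitivity 80%, specificity 70%) are not stated in the text; they are back-calculated from the table entries and the 0.255 used in the posterior calculation, so treat them as assumptions.

```python
def screening_table(prevalence, sensitivity, specificity):
    """Return the four cells of the 2x2 table as fractions of all patients."""
    return {
        "true_positive": prevalence * sensitivity,
        "false_negative": prevalence * (1 - sensitivity),
        "false_positive": (1 - prevalence) * (1 - specificity),
        "true_negative": (1 - prevalence) * specificity,
    }


# Assumed inputs, back-calculated from the table rather than given in the text.
cells = screening_table(prevalence=0.15, sensitivity=0.80, specificity=0.70)

for name, fraction in cells.items():
    print(f"{name}: {fraction:.1%}")

# The four cells partition all patients, so they sum to 1 (up to float error).
print("total:", round(sum(cells.values()), 10))
```

Under these assumed inputs the unrounded false-positive and true-negative cells are 25.5% and 59.5%, which the table shows rounded to 26% and 60%.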

Usually in statistical graphics, we place the scales on the horizontal and vertical axes, which is not the case with the above diagram. Sticking with the scales-on-axes convention, here is a streamlined graph:

We’ve generalized the notation a bit and emphasized (1-specificity) rather than the specificity itself.

  1. Prior(Alternative) = width of “Alternative” box.
  2. Likelihood for Alternative hypothesis, that is, p(\(\mathbb P\) | Alternative) (corresponds to sensitivity).
  3. Likelihood for Null hypothesis, that is, p(\(\mathbb P\) | Null) (corresponds to 1 - specificity).

The area of each box is, as expected, the width times the height. The two areas printed on the graph are the ingredients for the calculation of the posterior:

posterior(Alternative | \(\mathbb P\)) = 0.12/(0.12 + 0.255) = 32%
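The same arithmetic can be sketched in Python as width times height for each box, followed by the ratio of areas. The function name `posterior_alternative` is hypothetical, and the inputs (prior 0.15, likelihoods 0.80 and 0.30) are assumptions chosen to reproduce the areas 0.12 and 0.255 used above; the text does not state them directly.

```python
def posterior_alternative(prior_alt, lik_alt, lik_null):
    """Posterior probability of the Alternative given a positive test.

    prior_alt: width of the "Alternative" box (the Null box has width 1 - prior_alt)
    lik_alt:   p(P | Alternative), i.e. the sensitivity
    lik_null:  p(P | Null), i.e. 1 - specificity
    """
    area_alt = prior_alt * lik_alt            # width x height, Alternative box
    area_null = (1 - prior_alt) * lik_null    # width x height, Null box
    return area_alt / (area_alt + area_null)


# Assumed inputs that give areas of 0.12 and 0.255.
post = posterior_alternative(prior_alt=0.15, lik_alt=0.80, lik_null=0.30)
print(f"posterior(Alternative | P) = {post:.0%}")
```

Writing the calculation in terms of the two likelihoods rather than sensitivity and specificity matches the generalized notation of the streamlined graph, so the same function applies to any pair of hypotheses.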