Imagine that a doctor has to deal
with a patient who presents with sneezing (call this event S).
The underlying disease might be pneumonic plague (P, and dangerous) or
a cold (C, and not a major worry).
Obviously, the doctor needs to form an opinion about what is likely
to be going on. In our terms he needs to estimate both
P(P|S)
and
P(C|S). These are the probabilities of the diseases given the
symptoms. The answer is not intuitively obvious.
It isn't enough to know the probabilities P(S|C) and P(S|P),
which are the probabilities of sneezing if you have the relevant
diseases (most doctors would assume P(S|C) = 1.0 and P(S|P) = 1.0;
you are pretty well certain to sneeze if you have either of the
diseases).
Fortunately, it is general knowledge among doctors that, all other
things being equal, the common cold is more common than pneumonic
plague. In Scotland you can assume that P(C) = 0.25 (at least a
quarter of the patients visiting the doctor have a cold) and P(P) =
10^{-6} (about 1 in a million visitors to a doctor's surgery
have plague). It might also be reasonable to assume (from experience)
that P(S) = 0.35, meaning that about
1 in 3 of the patients are sneezing at a given time (this is called
the prior on sneezing). Putting all
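this together, you get
\[ P(C \mid S) = \frac{P(S \mid C)\,P(C)}{P(S)} = \frac{1.0 \times 0.25}{0.35} \approx 0.714 \]
and
\[ P(P \mid S) = \frac{P(S \mid P)\,P(P)}{P(S)} = \frac{1.0 \times 10^{-6}}{0.35} \approx 2.86 \times 10^{-6} \]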
That is, sneezing patients are much more likely to be cold victims
than harbingers of doom. Colds are 250,000 times more likely than
plague.
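The arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the helper name posterior and the variable names are illustrative choices, not part of the original notes, and the numbers are the ones quoted in the text.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence


# Figures quoted in the text.
p_s_given_c = 1.0  # P(S|C): sneezing given a cold
p_s_given_p = 1.0  # P(S|P): sneezing given pneumonic plague
p_c = 0.25         # P(C): base rate of colds
p_p = 1e-6         # P(P): base rate of plague
p_s = 0.35         # P(S): prior on sneezing

p_c_given_s = posterior(p_s_given_c, p_c, p_s)
p_p_given_s = posterior(p_s_given_p, p_p, p_s)

# P(S) cancels in the ratio, leaving only likelihoods and base rates.
odds = p_c_given_s / p_p_given_s

print(f"P(C|S) = {p_c_given_s:.3f}")
print(f"P(P|S) = {p_p_given_s:.2e}")
print(f"cold : plague odds = {odds:,.0f} : 1")
```

With these inputs P(C|S) comes out at about 0.714 and P(P|S) at about 2.86e-06, which gives the odds of roughly 250,000 to 1 mentioned above.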
Note the following:
- The dominant factor in the calculation is the assessment of the
base rates of the two illnesses (i.e. P(P) and P(C)).
Small changes in the other factors do not affect the conclusion
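much, since
\[ \frac{P(C \mid S)}{P(P \mid S)} = \frac{P(S \mid C)\,P(C)}{P(S \mid P)\,P(P)} \]
in which P(S) cancels and the ratio of base rates P(C)/P(P) dominates.
Understanding the derivation allows us to see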
which changes matter.
- It would be bad if the
doctor had simply learnt in medical school that P(P|S) is
low and P(C|S) is high, without deriving these from more robust
prior information. If P(P) changes markedly,
as would be the case in an epidemic of plague, the doctor might
not realize that this will have a proportionate effect on
not realize that this will have a proportionate effect on
P(P|S). A colleague who understands Bayes' rule would be much
better placed.
- The causal information P(S|C) and P(S|P) is unaffected by the
prevalence of either disease. Medical schools can and do provide
this sort of reliable knowledge to doctors.
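The points in this list can be illustrated numerically. The sketch below is an illustration under stated assumptions: the epidemic multipliers and the alternative likelihood of 0.9 are hypothetical values chosen for the example, and P(S) is held fixed at 0.35, which is only an approximation that holds while plague remains rare.

```python
P_S = 0.35           # P(S), held fixed (approximation while plague is rare)
BASELINE_P_P = 1e-6  # P(P) quoted in the text


def plague_posterior(p_p: float, p_s_given_p: float = 1.0, p_s: float = P_S) -> float:
    """P(P|S) = P(S|P) * P(P) / P(S)."""
    return p_s_given_p * p_p / p_s


baseline = plague_posterior(BASELINE_P_P)

# Hypothetical epidemic sizes (not from the text): scale the base rate P(P).
for factor in (1, 10, 100, 1000):
    p = plague_posterior(BASELINE_P_P * factor)
    print(f"P(P) x {factor:>4}: P(P|S) = {p:.2e} ({p / baseline:.0f}x baseline)")

# Hypothetical weaker causal link: P(S|P) = 0.9 instead of 1.0.
weaker = plague_posterior(BASELINE_P_P, p_s_given_p=0.9)
print(f"P(S|P) = 0.9: P(P|S) = {weaker:.2e} ({weaker / baseline:.2f}x baseline)")
```

Scaling the base rate scales the posterior by the same factor, while a modest change in the likelihood P(S|P) changes it only modestly.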
What happens if the common cold is eradicated?
Notice that the figures above imply that 10 in 100 patients were sneezing for reasons
which were neither colds nor plague, since P(S) - P(C) = 0.35 - 0.25 = 0.10.
The details change, since P(S)
drops to 0.10. The effect on the estimate of the probability
of plague is to increase it 3.5-fold, to
P(P|S) = 10^{-6} / 0.10 = 10^{-5}. Whether
this matters much depends on what the consequences of missing a patient
with pneumonic plague actually are.
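The eradication scenario can be recomputed in the same way. The sketch below derives the residual sneezing rate from the figures quoted earlier (P(S) = 0.35, P(C) = 0.25, P(S|C) = 1.0). It assumes that eradicating colds removes exactly the sneezers who had colds and leaves everything else unchanged; that assumption and the variable names are illustrative, and p_s_after can be set directly if a different residual rate is preferred.

```python
def plague_posterior(p_p: float, p_s: float, p_s_given_p: float = 1.0) -> float:
    """P(P|S) = P(S|P) * P(P) / P(S)."""
    return p_s_given_p * p_p / p_s


p_p = 1e-6          # P(P), from the text
p_c = 0.25          # P(C), from the text
p_s_given_c = 1.0   # P(S|C), from the text
p_s_before = 0.35   # P(S) before eradication, from the text

# Assumption: only the cold-related sneezing disappears.
p_s_after = p_s_before - p_s_given_c * p_c

before = plague_posterior(p_p, p_s_before)
after = plague_posterior(p_p, p_s_after)

print(f"P(S) after eradication: {p_s_after:.2f}")
print(f"P(P|S) before: {before:.2e}")
print(f"P(P|S) after:  {after:.2e}")
print(f"fold increase: {after / before:.1f}")
```

Because P(P|S) is inversely proportional to P(S), the fold increase is simply the old P(S) divided by the new one.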
Chris Brew
8/7/1998