# Latest content was relocated to https://bintanvictor.wordpress.com. This old blog will be shutdown soon.

## Monday, April 21, 2014

### HIV testing - conditional probability illustrated

A common conditional probability puzzle -- Suppose there's a test for HIV (or another virus). If you carry the virus, there's a 99% chance the test will correctly identify it, with a 1% chance of a false negative (FN). If you aren't a carrier, there's a 95% chance the test will come up clear, with a 5% chance of a false positive (FP). To my horror, my result comes back positive. Many would immediately assume there's a 99% chance I'm infected. That intuition, as in many probability puzzles, is incorrect.

In short, Pr(IsCarrier | Positive result) depends on the prevalence of HIV.

Suppose that in a population of 100 million people, the prevalence of HIV is X (a fraction between 0 and 1). This X is related to what I call the "pool distribution", a fixed, fundamental property of the population, to be estimated.

P(TP) = P(True Positive) = .99X
P(FN) = .01X
P(TN) = P(True Negative) = .95(1-X)
P(FP) = .05(1-X)

The 4 probabilities above add up to 100%. A positive result is either a TP or an FP. I feel a key question is "Which is more likely -- TP or FP?" This is classic conditional probability.
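To make the four outcomes concrete, here is a minimal Python sketch that computes them for one prevalence value and converts them to head counts in the 100 million population. The 99% and 95% figures come from the puzzle; the prevalence of 0.001 and the names `joint_probabilities`, `sensitivity` and `specificity` are my own assumptions for illustration.

```python
def joint_probabilities(prevalence, sensitivity=0.99, specificity=0.95):
    """Return the joint probability of each of the four test outcomes."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return {
        "TP": sensitivity * prevalence,              # carrier, tests positive
        "FN": (1 - sensitivity) * prevalence,        # carrier, tests negative
        "TN": specificity * (1 - prevalence),        # non-carrier, tests negative
        "FP": (1 - specificity) * (1 - prevalence),  # non-carrier, tests positive
    }


population = 100_000_000
x = 0.001  # assumed prevalence, for illustration only

outcomes = joint_probabilities(x)
for name, p in outcomes.items():
    print(f"{name}: p = {p:.5f}, about {p * population:,.0f} people")

# The four outcomes cover every case, so they should sum to 1.
print("total probability:", round(sum(outcomes.values()), 12))
```

With this assumed prevalence there are roughly 99,000 true positives against roughly 4,995,000 false positives, which already hints at the answer to "TP or FP?".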

Denote C==IsCarrier and P==Positive result. What's p(C|P)? The "flip" formula says

p(C|P) p(P) = p(C.P) = p(P|C) p(C)
p(P) is simply p(FP) + p(TP)
p(C) is simply X
p(P|C) is simply 99%
Actually, p(C.P) is simply p(TP)
So p(C|P) = p(TP) / (p(TP) + p(FP)) = .99X / (.99X + .05(1-X))
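As a numeric check on the flip formula, here is a small Python sketch that evaluates p(C|P) for a few prevalence values. The function name `posterior_carrier` and the sample prevalences are assumptions of mine, not part of the puzzle; the 99% and 5% rates are the ones given above.

```python
def posterior_carrier(prevalence, sensitivity=0.99, false_positive_rate=0.05):
    """p(C|P): probability of being a carrier given a positive result."""
    p_tp = sensitivity * prevalence                # p(P|C) p(C), i.e. p(C.P)
    p_fp = false_positive_rate * (1 - prevalence)  # positive but not a carrier
    p_positive = p_tp + p_fp                       # p(P) = p(TP) + p(FP)
    if p_positive == 0:
        raise ValueError("a positive result is impossible with these inputs")
    return p_tp / p_positive


# Sample prevalences are assumed values, chosen only to show the trend.
for x in (0.001, 0.01, 0.1, 0.5):
    print(f"X = {x}: p(C|P) = {posterior_carrier(x):.4f}")
```

At X = 0.01 the result is exactly 1/6, and at X = 0.001 it is only about 1.9%, nowhere near the 99% that intuition suggests.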

The notation is non-intuitive. I feel a more intuitive perspective is "Does TruePositive dominate FalsePositive or vice versa?" As explained in [[HowToBuildABrain]], if X is very low, then FalsePositive dominates TruePositive, so most of the positive results are false positives.
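The "who dominates" question has a simple break-even point: TP and FP are equally likely when .99X = .05(1-X), i.e. X = .05/1.04, or about 4.8%. A rough Python sketch of that comparison follows; the helper names `break_even_prevalence` and `dominant_outcome` are hypothetical, chosen just for this illustration.

```python
def break_even_prevalence(sensitivity=0.99, false_positive_rate=0.05):
    """Prevalence X at which p(TP) equals p(FP).

    Solves sensitivity * X == false_positive_rate * (1 - X) for X.
    """
    return false_positive_rate / (sensitivity + false_positive_rate)


def dominant_outcome(prevalence, sensitivity=0.99, false_positive_rate=0.05):
    """Say which kind of positive result is more likely at this prevalence."""
    p_tp = sensitivity * prevalence
    p_fp = false_positive_rate * (1 - prevalence)
    if p_tp > p_fp:
        return "TP"
    if p_fp > p_tp:
        return "FP"
    return "tie"


print(f"break-even prevalence: {break_even_prevalence():.4f}")
# Sample prevalences are assumed values on either side of the break-even point.
for x in (0.001, 0.01, 0.1, 0.5):
    print(f"X = {x}: {dominant_outcome(x)} dominates")
```

Below the break-even prevalence a positive result is more likely to be false than true; above it, the reverse holds.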