Monday, April 21, 2014

HIV testing - cond probability illustrated

A common cond probability puzzle -- Suppose there's a test for HIV (or another virus). If you carry the virus, there's a 99% chance the test will correctly identify it, with 1% chance of false negative (FN). If you aren't a carrier, there's a 95% chance the test will come up clear, with a 5% chance of false positive (FP). To my horror my result comes back positive. Many would immediately assume there a 99% chance I'm infected. The intuition is, like in many probability puzzles, incorrect.

In short Pr(IsCarrier|Positive result) depends on the prevalence of HIV.

Suppose out of 100million people, the prevalence of HIV is X (a number between 0 and 1). This X is related to what I call the "pool distribution", a fixed, fundamental property of the population, to be estimated.

P(TP) = P(True Positive) = .99X
P(FN) = .01X
P(TN) = P(True Negative) = .95(1-X)
P(FP) = .05(1-X)

The 4 probabilities above add up to 100%. A positive result is either a TP or FP. I feel a key question is "Which is more likely -- TP or FP". This is classic conditional probability.

Denote C==IsCarrier. What's p(C|P)? The "flip" formula says

p(C|P) p(P) = p(C.P) = p(P|C) p(C)
p(P) is simply p(FP) + p(TP)
p(C) is simply X
p(P|C) is simply 99%
Actually, p(C.P) is simply p(TP)

The notations are non-intuitive. I feel a more intuitive perspective is "Does TruePositive dominate FalsePositive or vice versa?" As explained in [[HowToBuildABrain]], if X is very low, then FalsePositive dominates TruePositive, so most of the positive results are false positives.

No comments:

Total Pageviews

my favorite topics (labels)

_fuxi (302) _misLabel (13) _orig? (3) _rm (2) _vague (2) clarified (58) cpp (39) cpp_const (22) cpp_real (76) cpp/java/c# (101) cppBig4 (54) cppSmartPtr (35) cppSTL (33) cppSTL_itr (27) cppSTL_real (26) cppTemplate (28) creditMkt (14) db (65) db_sybase (43) deepUnder (31) dotnet (20) ECN (27) econ/bank` (36) fin/sys_misc (43) finGreek (34) finReal (45) finRisk (30) finTechDesign (46) finTechMisc (32) finVol (66) FixedIncom (28) fMath (7) fMathOption (33) fMathStoch (67) forex (39) gr8IV_Q (46) GTD_skill (15) GUI_event (30) inMemDB (42) intuit_math (41) intuitFinance (57) javaMisc (68) javaServerSide (13) lambda/delegate (22) marketData (28) math (10) mathStat (55) memIssue (8) memMgmt (66) metaProgram` (6) OO_Design (84) original_content (749) polymorphic/vptr (40) productive (21) ptr/ref (48) py (28) reflect (8) script`/unix (82) socket/stream (39) subquery/join (30) subvert (13) swing/wpf (9) sysProgram` (16) thread (164) thread_CAS (15) thread_cpp (28) Thread* (22) timeSaver (80) transactional (23) tune (24) tuneDB (40) tuneLatency (30) z_ajax (9) z_algoDataStruct (41) z_arch (26) z_arch_job (27) z_automateTest (17) z_autoTrad` (19) z_bestPractice (39) z_bold (83) z_bondMath (35) z_book (18) z_boost (19) z_byRef^Val (32) z_c#GUI (43) z_c#misc (80) z_cast/convert (28) z_container (67) z_cStr/arr (39) z_Favorite* (8) z_FIX (15) z_forex (48) z_fwd_Deal (18) z_gz=job (33) z_gzBig20 (13) z_gzMgr (13) z_gzPain (20) z_gzThreat (19) z_hib (19) z_IDE (52) z_ikm (5) z_IR_misc (36) z_IRS (26) z_javaWeb (28) z_jdbc (10) z_jobFinTech (46) z_jobHunt (20) z_jobRealXp (10) z_jobStrength (15) z_jobUS^asia (27) z_letter (42) z_linq (10) z_memberHid` (11) z_MOM (54) z_nestedClass (5) z_oq (24) z_PCP (12) z_pearl (1) z_php (20) z_prodSupport (7) z_py (31) z_quant (14) z_regex (8) z_rv (38) z_skillist (48) z_slic`Problem (6) z_SOA (14) z_spring (25) z_src_code (8) z_swingMisc (50) z_swingTable (26) z_unpublish (2) z_VBA/Excel (8) z_windoz (17) z_wpfCommand (9)

About Me

New York (Time Square), NY, United States
http://www.linkedin.com/in/tanbin