There are many good materials on Bayes' theorem. Here I give a little more detail.
Bayes' theorem states that:

$$P(H|E) = \frac{P(E|H) P(H)}{P(E)}$$

where $H$ stands for the hypothesis being true and $E$ for the evidence.
The above equation can be visualized using a Venn diagram:
The red box is the event $H$; outside it is not $H$, or $\bar{H}$. The green boxes are the event $E$; outside the green boxes is not $E$, or $\bar{E}$. It is obvious that

$$P(E) = P(E|H) P(H) + P(E|\bar{H}) P(\bar{H})$$
One common application is medical testing. Suppose a test for a disease has a 90% true positive rate and a 10% false positive rate, and the chance of having the disease among the population is 0.1%. What is the probability that a positive test really means infection?
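From the data, we have $P(E|H) = 0.9$, $P(E|\bar{H}) = 0.1$, and $P(H) = 0.001$, then

$$P(H|E) = \frac{0.9 \times 0.001}{0.9 \times 0.001 + 0.1 \times 0.999} \approx 0.0089$$

which is only a 0.9% chance that he is infected. Let's do another test, using the first posterior as the new prior. If it is still positive, then

$$P(H|E) = \frac{0.9 \times 0.0089}{0.9 \times 0.0089 + 0.1 \times 0.9911} \approx 0.075$$

which is still only a 7.5% chance that he is infected.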
Number of positive tests | Probability of infection |
1 | 0.89% |
2 | 7.5% |
3 | 42.2% |
4 | 86.8% |
5 | 98.3% |
6 | 99.8% |
This table shows that, for a very rare disease, a testing method with a 90% true positive rate is not sufficient on its own; several consecutive positive tests are needed.
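The table can be reproduced by feeding each posterior back in as the next prior. This is a minimal sketch; the function names `posterior` and `repeated_positives` are made up for illustration, and the rates are the ones from the example above.

```python
def posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """One Bayes update after a positive test."""
    numerator = p_pos_given_h * prior
    return numerator / (numerator + p_pos_given_not_h * (1.0 - prior))


def repeated_positives(prior, tpr, fpr, n_tests):
    """Posterior after 1, 2, ..., n_tests consecutive positive tests."""
    results = []
    p = prior
    for _ in range(n_tests):
        p = posterior(p, tpr, fpr)  # the old posterior becomes the new prior
        results.append(p)
    return results


# 0.1% prevalence, 90% true positive rate, 10% false positive rate
for n, p in enumerate(repeated_positives(0.001, 0.9, 0.1, 6), start=1):
    print(f"{n} | {100 * p:.2f}% |")
```

The printed values should agree with the table up to rounding.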
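In fact, for $P(E|\bar{H}) = 1 - P(E|H)$,

$$P(H|E) = \frac{P(E|H) P(H)}{P(E|H) P(H) + (1 - P(E|H)) (1 - P(H))}$$

when $P(H) \ll 1$,

$$P(H|E) \approx \frac{P(E|H)}{1 - P(E|H)} P(H)$$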
The above plot shows $P(H|E)$ as a function of $P(H)$ for various $P(E|H)$. The black arrows start from the prior $P(H)$, and they show that each positive test, for a given $P(E|H)$, iterates toward a higher and higher $P(H|E)$.
We can see that if $P(E|H) < 0.5$, the test is basically useless, because there are more false positives than true positives.
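A small numerical check of this claim, assuming as above that the false positive rate is one minus the true positive rate. The prior of 0.3 and the function name are hypothetical choices for illustration only.

```python
def posterior_after_positive(prior, tpr):
    """Posterior after one positive test, assuming false positive rate = 1 - tpr."""
    fpr = 1.0 - tpr
    return tpr * prior / (tpr * prior + fpr * (1.0 - prior))


prior = 0.3  # hypothetical prior, chosen only for illustration
for tpr in (0.3, 0.5, 0.7, 0.9):
    post = posterior_after_positive(prior, tpr)
    print(f"P(E|H) = {tpr:.1f}: prior = {prior:.3f}, posterior = {post:.3f}")
```

Below 0.5 a positive result lowers the probability, at exactly 0.5 it leaves the prior unchanged, and only above 0.5 does it raise it.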
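A trickier question: what if the 1st test is positive, the 2nd test is negative, and the 3rd test is positive? In this case, we have to evaluate $P(H|\bar{E})$, the probability of the hypothesis given a negative test.
Using the Venn diagram,

$$P(H|\bar{E}) = \frac{P(\bar{E}|H) P(H)}{P(\bar{E}|H) P(H) + P(\bar{E}|\bar{H}) P(\bar{H})}$$

Using $P(\bar{E}|H) = 1 - P(E|H)$ and $P(\bar{E}|\bar{H}) = 1 - P(E|\bar{H})$,

$$P(H|\bar{E}) = \frac{(1 - P(E|H)) P(H)}{(1 - P(E|H)) P(H) + (1 - P(E|\bar{H})) (1 - P(H))}$$

So when $P(E|H) + P(E|\bar{H}) = 1$, the curve for $P(H|\bar{E})$ is the mirror image of the curve for $P(H|E)$ about the diagonal.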
The above plot starts from the prior $P(H)$. The first test is positive, which raises the probability to $P(H|E)$, but the 2nd test is negative, which brings it back down to the prior.
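The cancellation of a positive test by a negative test can be sketched numerically. The names `update` and `after_tests` are illustrative, and the prior of 0.3 is a hypothetical value; the rates are the 90% / 10% pair used earlier, which sum to one.

```python
def update(prior, like_h, like_not_h):
    """Generic Bayes update given the likelihood of the observed result."""
    numerator = like_h * prior
    return numerator / (numerator + like_not_h * (1.0 - prior))


def after_tests(prior, results, tpr, fpr):
    """Apply a sequence of test results (True = positive, False = negative)."""
    p = prior
    for positive in results:
        if positive:
            p = update(p, tpr, fpr)
        else:
            # a negative result uses the complementary likelihoods
            p = update(p, 1.0 - tpr, 1.0 - fpr)
    return p


prior = 0.3  # hypothetical prior for illustration
print(after_tests(prior, [True, False], 0.9, 0.1))        # positive then negative
print(after_tests(prior, [True, False, True], 0.9, 0.1))  # positive, negative, positive
print(after_tests(prior, [True], 0.9, 0.1))               # a single positive
```

Under this assumption a positive followed by a negative returns to the prior, so the sequence positive, negative, positive carries the same information as a single positive test.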
In the above discussion, we assumed that $P(E|H) + P(E|\bar{H}) = 1$, i.e. that the probabilities of a true positive and a false positive sum to unity. But that is not necessarily true.
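If we simplify with simple variables, writing $x = P(H)$ and $y = P(H|E)$, we have

$$y(x; a, b) = \frac{a x}{a x + b (1 - x)}$$

where $a$ represents $P(E|H)$ for $H$, and $b$ represents the other way, $P(E|\bar{H})$ for $\bar{H}$. And we can check that

$$y(x; a, b) = y\left(x; \frac{a}{a+b}, \frac{b}{a+b}\right)$$

So, the curve for $a + b \neq 1$ is the same as the curve for $a + b = 1$ with the transformation $a \to a/(a+b)$, $b \to b/(a+b)$.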
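A quick numerical check, assuming the update has the form $y = a x / (a x + b (1 - x))$ with $a = P(E|H)$ and $b = P(E|\bar{H})$: the posterior depends only on the ratio $a/b$, so rescaling the two rates to sum to one does not change the curve. The values 0.7 and 0.1 are hypothetical.

```python
def update(x, a, b):
    """Posterior y for prior x, with a = P(E|H) and b = P(E|not H)."""
    return a * x / (a * x + b * (1.0 - x))


a, b = 0.7, 0.1  # hypothetical rates with a + b != 1
a_norm, b_norm = a / (a + b), b / (a + b)  # rescaled so that a_norm + b_norm = 1

for x in (0.01, 0.2, 0.5, 0.9):
    original = update(x, a, b)
    rescaled = update(x, a_norm, b_norm)
    print(f"x = {x:.2f}: y = {original:.6f}, rescaled y = {rescaled:.6f}")
```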
As of today, zero covid cases have been reported for 40 consecutive days in Hong Kong. What is the probability that the virus still exists in Hong Kong?
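Our prior for covid is $P(H) = 0.2$, i.e. 20% of people have covid on day 0. Assume that 80% of people with covid show symptoms and 20% do not show any symptoms, i.e. $P(S|H) = 0.8$. Also, it is very likely that people without covid show covid-like symptoms, i.e. $P(S|\bar{H}) = 0.8$. And we also suppose that the covid test has a 70% true positive rate and a 10% false positive rate, i.e. $P(T|H) = 0.7$ and $P(T|\bar{H}) = 0.1$.
The probability of reporting a covid case, $P(R)$, is the sum of the true cases that test positive with symptoms, plus the false positives with fake symptoms.
$P(R|H) = P(S|H) P(T|H) = 0.8 \times 0.7 = 0.56$, so there is a 56% chance that a true covid case will be reported.
$P(R|\bar{H}) = P(S|\bar{H}) P(T|\bar{H}) = 0.8 \times 0.1 = 0.08$, so there is an 8% chance that a false covid case will be reported.
So, $P(R) = P(R|H) P(H) + P(R|\bar{H}) P(\bar{H}) = 0.56 \times 0.2 + 0.08 \times 0.8 = 0.176$, and there is a 17.6% chance that a covid case will be reported in the population.
$P(H|\bar{R}) = \frac{(1 - P(R|H)) P(H)}{1 - P(R)} = \frac{0.44 \times 0.2}{0.824} \approx 0.107$; thus, there is an 11% chance that there is covid but no case reported on the 1st day.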
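The arithmetic of this estimate can be laid out step by step. This is only a sketch; the variable names are made up for illustration, and the 80% symptom rate for people without covid is the value implied by the 8% false report figure.

```python
p_h = 0.2               # prior: fraction of people with covid on day 0
p_symptom_h = 0.8       # covid carriers who show symptoms
p_symptom_not_h = 0.8   # people without covid who show covid-like symptoms
p_test_h = 0.7          # test true positive rate
p_test_not_h = 0.1      # test false positive rate

# a case is reported when the person has symptoms and tests positive
p_report_h = p_symptom_h * p_test_h
p_report_not_h = p_symptom_not_h * p_test_not_h
p_report = p_report_h * p_h + p_report_not_h * (1.0 - p_h)

# probability of covid given that no case was reported
p_h_no_report = (1.0 - p_report_h) * p_h / (1.0 - p_report)

print(f"true case reported:   {p_report_h:.3f}")
print(f"false case reported:  {p_report_not_h:.3f}")
print(f"any case reported:    {p_report:.3f}")
print(f"covid but no report:  {p_h_no_report:.3f}")
```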
Here is the plot of the probability that covid is still present versus the number of consecutive days with zero reported cases.
Hong Kong has 7.5 million people. At the 20th day of zero reported cases, there could still be about 1 infected person hidden in the population. But at the 40th day of zero reported cases, the chance of having covid is on the order of $10^{-14}$.
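The day-by-day estimate can be sketched by repeating the "no case reported" update, using the 56% and 8% reporting chances from above and ignoring any spreading. The function name `hidden_probability` is hypothetical.

```python
def hidden_probability(n_days, prior=0.2, p_report_h=0.56, p_report_not_h=0.08):
    """Probability of covid after n_days in a row with no reported case."""
    p = prior
    for _ in range(n_days):
        miss_h = (1.0 - p_report_h) * p                  # covid present, not reported
        miss_not_h = (1.0 - p_report_not_h) * (1.0 - p)  # no covid, nothing reported
        p = miss_h / (miss_h + miss_not_h)
    return p


population = 7.5e6  # Hong Kong population used in the text
print(f"day 20: expected hidden cases = {population * hidden_probability(20):.2f}")
print(f"day 40: probability = {hidden_probability(40):.2e}")
```

Each zero-report day is treated as an independent observation here, which is a simplification.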
In the early days of the pandemic, the HK government required 21 days with no reported cases as a condition for relaxing the measures, which is reasonable: in our rough estimate, there could still be 1 real case among the population at that point.
In our rough estimate, we ignored the spreading of covid. To include the spreading, we can multiply the probability by the R-factor every day. Suppose the R-factor is 0.7; the actual fraction of covid cases in the population on the 1st day of zero reported cases is then scaled by this factor, and so on for each following day. It turns out that the probability needs more time to become as small as in the estimate without spreading. But given that Hong Kong has one of the lowest infection rates in the world, the effective R-factor should be small or close to 0; R = 0.7 is like having no protection at all.