Answer to Riddle #66: Disease Testing Accuracy.

66. A disease called The Phage is spreading throughout a kingdom of 1,000,000 people. Currently one in every 500 people has the disease.
The King's scientists develop a test with accuracy as follows: It will fail by false negative at a 1% rate. It will fail by false positive at a 2% rate. The King, planning to test everyone in the kingdom, is pleased. He thinks that if he was prepared to secretly murder the 2,000 infected people, a 2% rise won't make any difference. 'Hold on,' says the chief nerd, 'you'll actually have to murder X people and only one in Y of them will have The Phage.'

What are X & Y?

I went out looking for puzzles and came back with this. It's a real-world problem affecting epidemiologists, and you'll see why:

Instinctively this feels like X might be 2,020 or 2,040, within one or two percent of the 2,000 people (1 in 500 of a million) who are infected. We've seen before how misleading percentages can be, such as in the watermelon puzzle.

False Negative & False Positive

This terminology is used for all sorts of things that give only two possible outcomes: spam filters, computer virus scanners, and what we are doing here.

A False Positive is when someone is given a positive result when they shouldn't be, i.e. the person does not have the disease.
A False Negative, conversely, is when someone is given a negative result when they shouldn't be, i.e. the person has the disease.

Thus, perhaps not completely obviously, the True Positive rate is 1 - (False Negative rate) and the True Negative rate is 1 - (False Positive rate). Consider testing 100 people who all have the disease: there can be no false positives, but you can still get some false negatives. In our case the false negative rate is 1%, or 0.01, so when testing 100 infected people we would expect 1 negative result and 99 positives. So the True Positive rate is 99%. Another way of phrasing the question, in an alternative nomenclature, would be that the test is 99% accurate on people who have the disease and 98% accurate on people who don't.
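The relationship between the rates can be checked with a few lines of Python. This is only a minimal sketch: the variable names are mine, and exact fractions are used purely to avoid floating point rounding.

```python
from fractions import Fraction

# Error rates as stated in the puzzle.
false_negative_rate = Fraction(1, 100)
false_positive_rate = Fraction(2, 100)

# Each "true" rate is the complement of the matching "false" rate.
true_positive_rate = 1 - false_negative_rate  # 99/100
true_negative_rate = 1 - false_positive_rate  # 98/100

# Testing 100 people who all have the disease:
infected_tested = 100
expected_positives = infected_tested * true_positive_rate
expected_negatives = infected_tested * false_negative_rate

print(expected_positives, expected_negatives)  # prints "99 1"
```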

People often concern themselves with false negatives, for example how much spam gets through your spam filter. But really the damage is done by the false positives: real mail being blocked.

Back to the Kingdom

We'll work with 1,000,000 people, as it's slightly easier than working with the percentages all the time. Let's calculate the number of people that will be found positive. Firstly the True Positives:
population * infected rate * (1 - FNr)
1,000,000 / 500 * 0.99 = 1,980

Now the False Positives:
population * (1 - infected rate) * FPr
1,000,000 * 499 / 500 * 0.02 = 19,960

We can see what is happening: the True Negative rate of 98% sounds high, but the remaining 2% is acting on the huge number of people who are not infected.
X is 1,980 + 19,960 = 21,940
Y is 21,940 / 1,980 = 11 ish, 11.08080808...
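For anyone who wants to reproduce the arithmetic, here is a short Python sketch of the same calculation. The function name `positives` and its parameters are my own labels for illustration, not anything taken from the calculator on this page.

```python
from fractions import Fraction


def positives(population, one_in, fn_rate, fp_rate):
    """Return (true_positives, false_positives) when everyone is tested."""
    infected = Fraction(population, one_in)
    healthy = population - infected
    true_pos = infected * (1 - fn_rate)  # infected people correctly flagged
    false_pos = healthy * fp_rate        # healthy people wrongly flagged
    return true_pos, false_pos


tp, fp = positives(1_000_000, 500, Fraction(1, 100), Fraction(2, 100))
x = tp + fp  # everyone who tests positive
y = x / tp   # one in y of those actually has the disease

print(int(tp), int(fp), int(x))  # prints "1980 19960 21940"
print(round(float(y), 2))        # prints "11.08"
```

Fractions keep the intermediate values exact, so the totals come out as whole numbers rather than something like 19959.999.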

Calculator

Note: no information is sent to me, the calculation is done entirely locally, on your computer.
FNr = 0.01, FPr = 0.02, population = 1,000,000, Prevalence 1 in 500
TNr = 0.98, TPr = 0.99
Actual infected = 2,000
True Positives = 1,980, False Positives = 19,960, Total Positives X = 21,940, Y = 11.08

Conclusion

This is a real problem in real life. Think of the Ebola outbreak in 2015. Experts have to take serious precautions against this, for example only testing people with some symptoms, or people who have been exposed.
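To see why restricting who gets tested helps, here is a rough Python sketch that recomputes Y for a higher prevalence among the people tested. The 1 in 10 figure is purely an assumption I have picked for illustration, not real data; the error rates are the ones from the puzzle.

```python
from fractions import Fraction

FN_RATE = Fraction(1, 100)  # false negative rate from the puzzle
FP_RATE = Fraction(2, 100)  # false positive rate from the puzzle


def one_in_y(prevalence_one_in):
    """Y for a tested group where 1 in `prevalence_one_in` is infected."""
    infected = Fraction(1, prevalence_one_in)
    true_pos = infected * (1 - FN_RATE)
    false_pos = (1 - infected) * FP_RATE
    return (true_pos + false_pos) / true_pos


whole_kingdom = one_in_y(500)  # test everyone, as the King planned
symptomatic = one_in_y(10)     # ASSUMED prevalence among people with symptoms

print(round(float(whole_kingdom), 2))  # prints "11.08"
print(round(float(symptomatic), 2))    # prints "1.18"
```

With the assumed 1 in 10 prevalence, nearly all positives are genuine, whereas testing the whole kingdom gives roughly ten false alarms for every real case.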

Both AIs got this wrong. Differently, but both wrong nonetheless.

If you're curious what Bard made of this puzzle...

If you're curious what ChatGPT made of this puzzle...

© Nigel Coldwell 2004 -  – The questions on this site may be reproduced without further permission, I do not claim copyright over them. The answers are mine and may not be reproduced without my expressed prior consent. Please inquire using the link at the top of the page.

PayPal
I always think it's arrogant to add a donate button, but it has been requested. If I help you get a job though, you could buy me a pint! - nigel