Science and statistics

*round of applause*

You clicked to read the article with the most off-putting title 😉

Let’s face it, there aren’t many people who are fans of statistics, or maths. Maybe science, but perhaps more the ‘Neil deGrasse Tyson’ type of science, not the ‘let’s read some scientific papers’ type.

I have a scientific mind; I have my father to thank for that. I’ve never worked as a scientist but I have a few bits of paper from various universities, the highest being an MPhil.

If you’ve done some research on hysterectomies you’ve probably come across some statistics. Or some study results. Or some survey results. (You can read up on just about any subject these days and find statistics.)

There is a famous saying: “There are three kinds of lies: lies, damned lies, and statistics.” (Mark Twain made it famous, though he attributed it to Benjamin Disraeli.) Numbers make information seem factual and truthful, when in reality the same statistics can be presented in different ways to appear to show different things.

Let me give an example. Imagine there is a general risk of contracting a particular form of cancer. And then let’s say that by eating a certain food you increase that risk by 50%. That sounds really bad, doesn’t it? It sounds like you absolutely should not eat that food. But this bit of information is basically meaningless unless you know what the risk level was to start with.
If the risk increases by 50%, that doesn’t mean the risk *becomes* 50%.
These sentences are very different:
50% of people who eat food X will get cancer.
People who eat food X are 50% more likely to get cancer.

Example 1
Let’s say the risk was 25% (I’m not sure there’s any cancer where the general risk is that high but this is just an example!). So if the risk increases by 50%, that means it goes up by 50% of 25%, which is 12.5 percentage points. So the risk is now 37.5%. That is quite a big difference.
Example 2
But more commonly the risk level is low. Let’s say the risk was 0.3%. So if you choose to eat the food your risk increases by 50%. That means it goes up by 50% of 0.3%, which is a massive 0.15 percentage points. So the risk is now 0.45%, instead of 0.3%. The increase sounds huge (50%) but it has made very little difference to the actual chance of getting that particular cancer.
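
If you’d rather let a computer do the sums, here is a minimal Python sketch of the same calculation. The function name `risk_after_increase` is just something I made up for illustration, and the baseline risks are the invented ones from the two examples, not real cancer figures.

```python
def risk_after_increase(baseline_risk, relative_increase):
    """Return the new absolute risk after a relative increase.

    Both arguments are fractions: 0.25 means 25%, and 0.50 means "50% more".
    """
    return baseline_risk * (1 + relative_increase)


# The two made-up baseline risks from the examples, each with a 50% increase.
for baseline in (0.25, 0.003):
    new_risk = risk_after_increase(baseline, 0.50)
    added = new_risk - baseline  # how much the absolute risk actually moved
    print(f"baseline {baseline:.2%} -> new risk {new_risk:.2%} "
          f"(up {added * 100:.2f} percentage points)")
```

The relative increase is identical in both runs. Only the starting risk changes, and that is what decides how much the final number moves.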

Do you see what I mean? Numbers can be presented in a way to make something sound as if it has a major effect, when in reality it does not. From a *scientific* point of view it may be substantial, but if you are talking about making decisions about your life it barely moves the needle.

Here’s another way to express the same numbers:
Example 1 – general risk = 25%. This is the same as 2500 out of 10000 people.
Increased risk = 37.5%. This is the same as 3750 out of 10000 people.

Example 2 – general risk = 0.3%. This is the same as 30 out of 10000 people.
Increased risk = 0.45%. This is the same as 45 out of 10000 people.
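
The “out of 10000 people” conversion can be sketched in a few lines of Python too. Again, this is only an illustration: `people_affected` is a hypothetical helper name, and the risks are the made-up numbers from the two examples.

```python
def people_affected(risk, group_size=10_000):
    """Convert a risk (as a fraction, e.g. 0.003 for 0.3%) into a head count."""
    # round() tidies up tiny floating-point errors so we get whole people.
    return round(risk * group_size)


# (label, risk before, risk after) for the two made-up examples.
examples = (
    ("Example 1", 0.25, 0.375),
    ("Example 2", 0.003, 0.0045),
)

for label, before, after in examples:
    extra = people_affected(after) - people_affected(before)
    print(f"{label}: {people_affected(before)} -> {people_affected(after)} "
          f"out of 10000 ({extra} extra people)")
```

Counting the extra people is often the easiest way to feel the difference: the same 50% increase means 1250 extra people in Example 1 but only 15 in Example 2.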

Or if you like diagrams:
(diagram)

So if you are weighing up decisions based on numbers like this, or even just reading an article which makes a claim like this, dig a little deeper. Don’t assume that a 50% increase is huge, because if there isn’t much there to begin with, 50% of not very much is still not very much!
(cartoon)

Another example, and this time let’s make it a non-medical one. How about a candidate for mayor? In the election this year, he got 75% more votes than he did in the last election! He needs to get over 6000 votes to win the election – did he manage it?
Wow, that sounds like he did really well. He probably won. Or did he? That depends on how many votes he got before.

Example 1:
He got 4500 votes last time. This time he got 75% more; 75% of 4500 is 3375.
So he got 7875 votes this time. A lot more! Enough to win the election!
Example 2:
He got 8 votes last time. This time he got 75% more; 75% of 8 is 6.
So he got 14 votes this time. A big increase wasn’t enough to make any difference at all…
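
The mayor’s sums work the same way, and here is a small Python sketch of them. The names `votes_this_time` and `VOTES_NEEDED` are my own inventions for this illustration, and I’m assuming “over 6000” means strictly more than 6000.

```python
VOTES_NEEDED = 6000  # assumption: he needs strictly more than this to win


def votes_this_time(previous_votes, percent_increase=75):
    """Return this year's votes given last time's votes and a percentage increase."""
    # Whole-number arithmetic; any fraction of a vote is rounded down.
    return previous_votes * (100 + percent_increase) // 100


# The two made-up starting points from the examples.
for previous in (4500, 8):
    current = votes_this_time(previous)
    outcome = "wins" if current > VOTES_NEEDED else "loses"
    print(f"{previous} votes last time -> {current} votes this time: he {outcome}")
```

The percentage increase is the same in both runs, so the only thing that decides the result is how many votes he had to begin with.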

 

Hopefully these two examples have shown you how statistics can be misleading, and why it’s important not to take numbers like this at face value. If you’d like to read some more on this topic I’ve listed some useful links below, including a TED talk:
https://www.datapine.com/blog/misleading-statistics-and-data/
http://www.truthpizza.org/logic/stats.htm
https://abcnews.go.com/Technology/story?id=6034371&page=1
TED talk – https://ed.ted.com/lessons/how-statistics-can-be-misleading-mark-liddell
http://thehealthcareblog.com/blog/2010/05/12/statistics-%E2%80%93-using-the-truth-to-mislead/
https://healthjournalism.org/blog/2009/12/duo-writes-about-how-health-statistics-can-mislead/