Saturday, 8 August 2015

Is an elephant heavier than a mouse?

Wow. It's been a while since I have written a blog post. Much of this is because of work and travels, and book writing (more news on that in the near future). But I'd like to get into blog writing, so here's a little science musing.

Is an elephant heavier than a mouse?

Now, you are probably saying "well, of course". Surely even the heaviest mouse weighs less than a newborn baby elephant, so why am I asking such a stupid question?

Well, because science can never really prove that an elephant is heavier than a mouse.

I know, I've gone and put that word in, and I've written about how proof has no place in science. But let's examine this in a little more detail.

I've stressed many times before that while measurements are important in science, without an uncertainty such measurements are useless. And while professional scientists pore over papers focused on the error bars in figures, errors and uncertainties are typically waved over in undergraduate degrees. Luckily, this is changing, and statistical understanding is weaving its way through courses (but still not enough, in my humble opinion).

For the simple example here, we'll consider Gaussian errors. I've just grabbed this piccy off the web, as it explains the situation nicely.

The top figure is the important one, and shows us the characteristic "bell-shaped" curve you get with Gaussian errors. The curve represents what a scientist would see as a measurement; the peak of the curve gives us the best estimate for the measurement. But no measurement is perfect: measuring devices have limitations, noise is introduced, and any measured value will be somewhat off from "reality" (and let's not open that can of worms). The distribution shows our belief in where the true value lies: most probably at the peak, with a good chance of it sitting in the body of the bell, and almost certainly within the entire range shown in the figure.

For the interested, there are plenty of tables of the values of these normal distributions.
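If you'd rather compute than look things up in a table, a couple of lines of Python will do the job. This is just a sketch (not from the original tables): the fraction of a Gaussian that lies within n standard deviations of the mean can be written in terms of the error function.

```python
# Fraction of a Gaussian lying within n standard deviations of the
# mean, using only the standard library: erf(n / sqrt(2)).
import math

def within_n_sigma(n):
    """Probability that a Gaussian measurement lies within n sigma of the mean."""
    return math.erf(n / math.sqrt(2))

for n in (1, 2, 3):
    print(f"within {n} sigma: {within_n_sigma(n):.5f}")
# within 1 sigma: 0.68269
# within 2 sigma: 0.95450
# within 3 sigma: 0.99730
```

These are the familiar 68%, 95% and 99.7% figures — and note that none of them is ever exactly 100%.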

OK, back to our elephant and mouse. Let's suppose we measure the elephant by popping it on some scales, finding it to be 5000 kg. Every measurement has an uncertainty; these scales are quite accurate, so the width of the Gaussian is 1 kg.

We have to use a different set of scales for the mouse, finding it to be 500 g, with an uncertainty (the width of the Gaussian) of 10 g.

Job done, you think. 5000 kg is more than 500 g, so the elephant wins!

Not so fast! The uncertainties matter!!! While the Gaussian drops away from its peak value, getting smaller and smaller, it never quite reaches zero. This means that while we are confident that the mouse is somewhere between 495 g and 505 g, there is a small chance that it is actually 510 g, a smaller chance that it is 600 g, an extremely small chance that it is actually 1 kg, and an absolutely minuscule chance that it is 10000 kg.

And we can play the same game with the elephant. And while we are happy its weight is around 5000 kg, there is an absolutely minuscule chance that it is actually 10 g.

Put all together, this means that while we have an extreme amount of confidence that the elephant is heavier than the mouse, there is this tiny possibility from our measurements that the mouse is actually heavier than the elephant!!
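Just how tiny? We can sketch it in a few lines of Python, assuming both measurements are independent Gaussians with the values above (the difference of two Gaussians is itself a Gaussian). The exact tail probability is far too small for ordinary floating point, so the code uses the standard large-z approximation for the Gaussian tail and works in logarithms.

```python
# Sketch: probability that the mouse outweighs the elephant, assuming
# the two measurements are independent Gaussians (values from the post).
import math

elephant_kg, sigma_elephant = 5000.0, 1.0
mouse_kg, sigma_mouse = 0.5, 0.010

# The difference D = mouse - elephant is also Gaussian, with:
mu = mouse_kg - elephant_kg
sigma = math.sqrt(sigma_elephant**2 + sigma_mouse**2)

# P(D > 0) is the upper Gaussian tail at z standard deviations:
z = -mu / sigma

# The exact tail underflows double precision, so use the large-z
# approximation Q(z) ~ exp(-z^2/2) / (z * sqrt(2*pi)), in log10:
log10_prob = (-z**2 / 2 - math.log(z * math.sqrt(2 * math.pi))) / math.log(10)
print(f"z = {z:.1f} sigma, P(mouse heavier) ~ 10^({log10_prob:.0f})")
```

The answer is a roughly 5000-sigma result: a probability of around one in 10 to the power of several million. Not zero, but about as close as anything in science ever gets.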

Now, I know that some of my colleagues will complain about this, as distributions in the real world will not necessarily be Gaussian, etc., but that's secondary to the point I am trying to get across.

Also, some will say that once chances get below a certain level they may as well be treated as certainties, and while this is true, it is important to remember that the choice of where the dividing line lies is rather arbitrary (i.e. the choice of n-sigma or p-values, etc.). There is nothing magical about these values!
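To see how arbitrary these dividing lines are, here's a quick sketch of the p-values behind some commonly used n-sigma cut-offs (the 5-sigma threshold is the one particle physicists conventionally demand for a "discovery"):

```python
# The (two-sided) p-values behind common "n-sigma" thresholds,
# showing there is nothing magical about any particular cut-off.
import math

def two_sided_p(n):
    """Two-sided p-value for an n-sigma Gaussian deviation."""
    return math.erfc(n / math.sqrt(2))

for n in (2, 3, 5):
    print(f"{n} sigma -> p = {two_sided_p(n):.2e}")
# 2 sigma -> p = 4.55e-02
# 3 sigma -> p = 2.70e-03
# 5 sigma -> p = 5.73e-07
```

Each step looks like a natural place to draw a line, but nothing in nature singles any of them out; they are conventions, chosen by people.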

So, what's the point of all of that? Well, clearly in the question of the elephant and the mouse, the chance of the elephant being lighter than the mouse is so ridiculously small that you can be pretty certain that the elephant is heavier.

But science is rarely about comparing a mouse's mass and an elephant's mass; it is often about making measurements at the limits of the equipment's abilities. And then the question of how significant the result is becomes of critical importance.

When people claimed to have found the Higgs Boson, there was a lot of discussion around the statistics, with many struggling to explain why they thought the detection was significant (and some performed particularly badly).

But such discussions are not typically found in the media's coverage of science findings, such as today's claim that pears cure hangovers. And really, this means that these stories are basically worthless, as you cannot assess how robust the result is (oh, and the pear result is preliminary, which normally means that the statistics are poor and the result could be a fluke, likely to vanish with more data).

And all science then gets lumped into a single basket, and people view robust science, such as climate science, as being similar to statistically flaky measurements, such as red wine being good one day and bad the next.

If every journalist simply asked for an estimate of the statistical significance of a particular day's scientific press release, I think that many would not see the light of day and the overall reporting of science would undoubtedly be improved.

To be truly scientifically literate, you must be statistically literate. It's important to remember that.