Tuesday, 24 December 2013

The Most Important Equation in all of Science

The end of the year is rapidly approaching (just where did 2013 go?). I've just spent the week in Canberra at a very interesting meeting, but on the way down I actually thought I was off into space. Why? Because the new Qantas uniform looks like something from Star Trek.
Doesn't Qantas know what happens to red shirts?

Anyway - why was I in Canberra? I was attending MaxEnt 2013, which, to give it its full name, was the 33rd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering. 

It was a very eclectic meeting, with astronomers, economists, quantum physicists, even people who work on optimizing train networks.  So, what was it that brought this range of people together? It was the most important equation in science. Which equation is that, you may ask? It's this one.
Huh, you may say. I've spoken about this equation multiple times in the past, and what it is, of course, is Bayes Rule.

Why is it so important? It allows you to update your prior belief, P(A), in light of new data, B. So, in reality, what it does is connect the two key aspects of science, namely experiment and theory. Now, this is true not only in science done by people in white coats or at telescopes, it is actually how we live our day-to-day lives.
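For reference, here is the rule written out in the notation used in this post, with A the belief (or hypothesis) and B the new data. The labels "likelihood" and "evidence" for the other two terms are the usual textbook names rather than anything defined here.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

P(A) is the prior, P(B|A) is the likelihood of seeing the data if the belief is right, P(B) is the evidence that normalises everything, and P(A|B) is the updated belief once the data are in.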

You might be shaking your head saying "I don't like equations, and I definitely haven't solved that one", but luckily it's kind of hardwired in the brain.

As an example, imagine that you were really interested in the height of the movie star Matt Damon (I watched Elysium last night). I know he's an adult male, and so I have an idea that he is likely to be taller than about 1.5m, but is probably less than around 2.5m. This is my prior information. But notice that it is not a single number, it is a range, so my prior contains an uncertainty.

Here, uncertainty has a very well defined meaning. It's the range of heights of Matt Damon that I think are roughly equally likely to be correct. I often wonder if using the word uncertainty confuddles the general public when it comes to talking about science (although it is better than "error" which is what is often used to describe the same thing). Even the most accurate measurements have an uncertainty, but the fact that the range of acceptable answers is narrow is what makes a measurement accurate.

In fact, I don't think that it is incorrect to say that it's not the measurement that gets you the Nobel Prize, but a measurement with a narrow enough uncertainty that allows you to make a decisive statement that does.

I decide to do an internet search, and I find quite quickly that he appears to be 1.78m. Sounds pretty definitive, doesn't it, but a quick look at the internet reveals that this appears to be an overestimate, and he might be closer to 1.72m. So my prior is now updated with this information. In light of this data, it's gone from having a width of a metre to about 10cm. What this means is that the accuracy of my estimate of the height of Matt Damon has improved.

Then, maybe, I am walking down the street with a tape measure, and who should I bump into but Mr Damon himself. With measure in hand, I can get some new, very accurate data, and update my beliefs again. This is how learning processes proceed.
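If you would like to see this chain of updates as numbers, here is a minimal Python sketch. It assumes a flat prior between 1.5m and 2.5m, treats each internet figure as a Gaussian measurement good to about 5cm, and invents a tape-measure reading of 1.75m good to 1cm. Those uncertainties and the tape-measure value are my assumptions for illustration, not real data.

```python
import numpy as np

# Grid of candidate heights in metres.
heights = np.linspace(1.0, 3.0, 2001)

def normalise(p):
    """Scale a set of weights so they sum to one over the grid."""
    return p / p.sum()

def gaussian_likelihood(measured, sigma):
    """Likelihood of each candidate height given one noisy measurement."""
    return np.exp(-0.5 * ((heights - measured) / sigma) ** 2)

def width_68(p):
    """Width of the central interval holding about 68% of the probability."""
    cdf = np.cumsum(p)
    lo = heights[np.searchsorted(cdf, 0.16)]
    hi = heights[np.searchsorted(cdf, 0.84)]
    return hi - lo

# Prior: flat between 1.5 m and 2.5 m, zero elsewhere.
prior = normalise(((heights >= 1.5) & (heights <= 2.5)).astype(float))

# Update 1: the two internet figures, each ASSUMED good to about 5 cm.
after_web = normalise(
    prior * gaussian_likelihood(1.78, 0.05) * gaussian_likelihood(1.72, 0.05)
)

# Update 2: a HYPOTHETICAL tape-measure reading of 1.75 m, assumed good to 1 cm.
after_tape = normalise(after_web * gaussian_likelihood(1.75, 0.01))

for label, p in [
    ("prior", prior),
    ("after web search", after_web),
    ("after tape measure", after_tape),
]:
    mean = np.sum(heights * p)
    print(f"{label}: mean = {mean:.3f} m, 68% width = {width_68(p):.3f} m")
```

Each update is just "multiply the current belief by the likelihood of the new data and renormalise", and the width of the belief shrinks at every step.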

But what does this have to do with science? Well, science proceeds in a very similar fashion. The part that people forget is that there is not a group consciousness deciding on what our prior information, P(A), is. This is something inside the mind of each and every researcher.

Let's take a quick example - namely working out the value of Hubble's constant. The usual disclaimer applies here, namely, I am not a historian but I did live through and experience the last stages of the history.

Hubble's constant is a measure of the expansion of the Universe. It comes from Hubble's original work on the recession speed of galaxies, with Hubble discovering that the more distant a galaxy is, the faster it is moving away from us. Here's one of the original measures of this.
Across the bottom is distance, in parsecs, and up the side is velocity in km/s. So, where does the Hubble constant come in? It's just the relation between the distance and the velocity, such that

v = H0 D

The awake amongst you will have noted that this is just a linear fit to the data. The even more awake will note that the "constant" is not the normal constant of a linear fit, as this is zero (i.e. the line goes through the origin), but is, in fact, the slope.
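To make the "slope, not intercept" point concrete, here is a small Python sketch of a least-squares fit of a straight line forced through the origin. The data points are made up for illustration (they are not Hubble's measurements), and the function name is my own.

```python
import numpy as np

def fit_slope_through_origin(distance, velocity):
    """Least-squares slope for the model velocity = slope * distance (no intercept)."""
    distance = np.asarray(distance, dtype=float)
    velocity = np.asarray(velocity, dtype=float)
    return np.sum(distance * velocity) / np.sum(distance ** 2)

# Made-up illustrative points (distance in Mpc, velocity in km/s); NOT Hubble's data.
rng = np.random.default_rng(0)
true_h0 = 70.0
distance = np.linspace(5.0, 100.0, 20)
velocity = true_h0 * distance + rng.normal(0.0, 300.0, size=distance.size)

h0 = fit_slope_through_origin(distance, velocity)
print(f"fitted slope H0 = {h0:.1f} km/s/Mpc")
```

With the intercept pinned at zero there is only one number left to fit, and that number is the Hubble constant.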

So, Hubble's constant is a measure of the rate of expansion of the Universe, and so it is something that we want to know. But we can see there is quite a bit of scatter in the data, and a range of straight lines fit the data. In fact, Hubble was way off in his fit as he was suffering from the perennial scientific pain, the systematic error, as he didn't know he had the wrong calibration for the Cepheid variable stars he was looking at.

As I've mentioned before, measuring velocities in astronomy is relatively easy as we can use the Doppler effect. However, distances are difficult; there are whole books on the precarious nature of the cosmological distance ladder.

Anyway, people used different methods to measure the distances, and so got different values of the Hubble constant. Roughly put, some were getting values around 50 km/s/Mpc, whereas others were getting roughly twice this value. 

But isn't this rather strange? All of the data were public, so why were there two values floating around? Surely there should be one single value that everyone agreed on?

What was different was the various researchers' priors, P(A). What do I mean by this? Well, for those that measured a value around 50 km/s/Mpc, further evidence that found a Hubble's constant about this value reinforced their view, but their priors strongly discounted evidence that it was closer to 100 km/s/Mpc. Whereas those who thought it was about 100 km/s/Mpc discounted the measurements of the researchers who found it to be 50 km/s/Mpc.

This is known as confirmation bias. Notice that it has nothing to do with the data, but is in researchers' heads; it is their prior feeling of what they think the answer should be, and their opinion of other researchers and their approaches.

However, as we get more and more data, eventually we should have lots of good, high quality data, so that the data overwhelms the prior information in someone's head. And that's what happened to Hubble's constant. Lots of nice data.
And we find that Hubble's constant is about 72 km/s/Mpc - we could have saved a lot of angst if the 50 km/s/Mpc groups and the 100 km/s/Mpc group split the difference years ago.
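Here is a toy Python sketch of data overwhelming a prior. Two camps start with Gaussian priors centred on 50 and 100 km/s/Mpc and are fed the same simulated measurements scattered around 72 km/s/Mpc. The prior widths, the measurement scatter and the data themselves are all assumptions for illustration. It also assumes both camps weigh the data honestly, which is kinder than the real history.

```python
import numpy as np

def gaussian_update(prior_mean, prior_sigma, data, data_sigma):
    """Posterior mean and sigma for a Gaussian prior and Gaussian measurements."""
    data = np.asarray(data, dtype=float)
    prior_weight = 1.0 / prior_sigma ** 2
    data_weight = data.size / data_sigma ** 2
    post_var = 1.0 / (prior_weight + data_weight)
    post_mean = post_var * (prior_weight * prior_mean + data.sum() / data_sigma ** 2)
    return post_mean, np.sqrt(post_var)

# Illustrative numbers only: simulated measurements scattered around 72 km/s/Mpc.
rng = np.random.default_rng(1)
data_sigma = 10.0
measurements = rng.normal(72.0, data_sigma, size=1000)

for n in (0, 1, 10, 100, 1000):
    low_camp, _ = gaussian_update(50.0, 5.0, measurements[:n], data_sigma)
    high_camp, _ = gaussian_update(100.0, 5.0, measurements[:n], data_sigma)
    print(f"n = {n:4d}: low camp -> {low_camp:5.1f}, high camp -> {high_camp:5.1f}")
```

With no data the two camps sit 50 km/s/Mpc apart; after a thousand measurements the gap between them has shrunk to a fraction of a km/s/Mpc, whatever they believed at the start.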

It is very important to note that science is not about black-and-white, yes-and-no. It's about gathering evidence and working out which models best describe the data, and using the models to make more predictions. However, it is weighted by your prior expectations, and different people will require differing amounts of information before they change their world view.

This is why ideas often take a while to filter through a community, eventually becoming adopted as the leading explanation. 

On a final note, I think to be a good scientist, you have to be careful with your prior ideas and it's very naughty to set a prior on a particular idea to be precisely zero. If you do, then no matter how much evidence you gather, you will never be able to accept a new idea; I think we all know people like this, but in science there's a history of people clinging to their ideas, often to the grave, impervious to evidence that contradicts their favoured hypothesis. And again, this has nothing to do with the observations of the Universe, but what is going on in their heads (hey - scientists are people too!).
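The zero-prior trap is easy to show with a few lines of Python. This is a sketch for a simple yes/no hypothesis, with made-up likelihoods (the evidence is nine times more probable if the idea is true than if it is false); the function names are mine.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """One application of Bayes' rule for a yes/no hypothesis."""
    numerator = likelihood_if_true * prior
    evidence = numerator + likelihood_if_false * (1.0 - prior)
    return numerator / evidence

def after_many_updates(prior, n_updates, likelihood_if_true=0.9, likelihood_if_false=0.1):
    """Apply the same piece of favourable evidence n_updates times in a row."""
    belief = prior
    for _ in range(n_updates):
        belief = update(belief, likelihood_if_true, likelihood_if_false)
    return belief

print(after_many_updates(0.0, 50))    # a prior of exactly zero stays at zero
print(after_many_updates(0.001, 50))  # a tiny non-zero prior ends up very close to 1
```

The prior multiplies everything in the numerator, so once it is zero no amount of evidence can lift it.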

Anyway, with my non-zero priors, I will update my world-view based upon new observations of the workings of the Universe. And to do this I will use the most important equation in all of science, Bayes rule. And luckily, as part of the MaxEnt meeting, I can now proudly wear Bayes rule wherever I go. Go Bayes :)

Saturday, 7 December 2013

The large-scale structure of the halo of the Andromeda Galaxy Part I: global stellar density, morphology and metallicity properties

And now for a major paper from the Pan-Andromeda Archaeological Survey (PAndAS), led by astronomer-extraordinaire Rodrigo Ibata. I've written a lot about PAndAS over the years (or maybe the year and a bit I've been blogging here) and we've discovered an awful lot, but one of the key things we wanted to do is measure the size and shape of the stellar halo of the Andromeda Galaxy.

The stellar halo is an interesting place. It's basically made up of the first generation of stars that formed in the dark matter halo in which the spiral galaxy of Andromeda was born, and the properties of the halo are a measure of the formation history of the galaxy, something we can directly compare to our theoretical models.

But there is always a spanner in the works, and in this case it is the fact that Andromeda, like the Milky Way, is a cannibal and has been eating smaller galaxies. These little galaxies get ripped apart by the gravitational pull of Andromeda, and their stars litter the halo in streams and clumps. As we've seen before, Andromeda has a lot of this debris scattered all over the place.

So, we are left with a problem, namely how do we see the stellar halo, which is quite diffuse and faint, underneath the prominent substructure? This is where this paper comes in.

Well, the first thing is to collect the data, and that's where PAndAS comes in. The picture below shows just how big the PAndAS survey is, and just how long it took us to get the data.
It always amazes me how small the spiral disk of Andromeda is compared to the area we surveyed, but that's what we need to see the stellar halo which should be a couple of hundred kiloparsecs in extent.

Taking the data is only the first step. The next step, the calibration of the data, was, well, painful. I won't go into the detail here, but if you are going to look for faint things, you really need to understand your data at the limit, to understand what's a star, what's a galaxy, what's just noise. There are lots of things you need to consider to make sure the data is nice, uniform and calibrated. But that's what we did :)

Once we've done that, we can ask where the stars are. And here they are:
As you can see, chunks and bumps everywhere, all the dinner leftovers of the cannibal Andromeda. And all of that stuff is in the way of finding the halo!

What do we do? We have to mask out the substructure and search for the underlying halo. We are in luck, however, as we don't have one map of substructure, we have a few of them. Why? Well, I've written about this before, but the stars in the substructure come from different sized objects, and so their chemical history will be different; in little systems, the heavy elements produced in supernova explosions are not held by their gravitational pull, and so they can be relatively "metal poor", but in larger systems the gas can't escape and gets mixed into the next generation of stars, making them more "metal rich".

So, here are our masks as a function of the iron abundance compared to hydrogen.
We see that the giant stream is more metal rich, but as we go to metal poor we see the more extensive substructure, including the South West Cloud.

What do we find? Well, we see the halo (hurrah!) and it does what it should - it is brightest near the main body of Andromeda, but gets fainter and fainter towards the edge. Here's a picture of the profile:
It's hard to explain just how faint the halo is, but it is big, basically stretching out to the edge of our PAndAS data, and then beyond, and looks like it accounts for roughly 10% of the stellar mass in Andromeda. It is not inconsequential!

But as we started out noting, its properties provide important clues to the very process of galaxy formation. And it appears that it looks like what we would expect from our models of structure formation, with large galaxies being built over time through the accretion of smaller systems.

We're working on a few new tests of the halo, and should hopefully have some more results soon. But for now, well done Rod!

The large-scale structure of the halo of the Andromeda Galaxy Part I: global stellar density, morphology and metallicity properties

We present an analysis of the large-scale structure of the halo of the Andromeda galaxy, based on the Pan-Andromeda Archeological Survey (PAndAS), currently the most complete map of resolved stellar populations in any galactic halo. Despite copious substructure, the global halo populations follow closely power law profiles that become steeper with increasing metallicity. We divide the sample into stream-like populations and a smooth halo component. Fitting a three-dimensional halo model reveals that the most metal-poor populations ([Fe/H]<-1.7) are distributed approximately spherically (slightly prolate with ellipticity c/a=1.09+/-0.03), with only a relatively small fraction (42%) residing in discernible stream-like structures. The sphericity of the ancient smooth component strongly hints that the dark matter halo is also approximately spherical. More metal-rich populations contain higher fractions of stars in streams (86% for [Fe/H]>-0.6). The space density of the smooth metal-poor component has a global power-law slope of -3.08+/-0.07, and a non-parametric fit shows that the slope remains nearly constant from 30kpc to 300kpc. The total stellar mass in the halo at distances beyond 2 degrees is 1.1x10^10 Solar masses, while that of the smooth component is 3x10^9 Solar masses. Extrapolating into the inner galaxy, the total stellar mass of the smooth halo is plausibly 8x10^9 Solar masses. We detect a substantial metallicity gradient, which declines from [Fe/H]=-0.7 at R=30kpc to [Fe/H]=-1.5 at R=150kpc for the full sample, with the smooth halo being 0.2dex more metal poor than the full sample at each radius. While qualitatively in-line with expectations from cosmological simulations, these observations are of great importance as they provide a prototype template that such simulations must now be able to reproduce in quantitative detail.