One good thing about living in Sydney, which I've noted before, is that it is easy to get to see international rugby at the Olympic Park. An excellent free bus service is provided to bring people in from the far reaches of Sydney, and then take them home again.
So, I've been thinking: if people turn up at a bus stop at a certain rate, and buses arrive at a certain rate, then what do we expect the number of people on each bus to be?
Being a physicist, I'll start by simplifying the problem. Let's assume that the average number of people who turn up per hour is λ1, and the average number of buses that turn up per hour is λ2. Let's also assume that the bus instantaneously picks up all the passengers who are waiting, and then heads off.
Imagine you are on the gate of the park, watching the buses leave. What's the distribution for the number of passengers on each bus?
The problem requires two parts, both based on the Poisson distribution. I mentioned this in the recent discussion on the German tank problem, but it is an extremely powerful description of random events, from the drops coming from a tap, to the number of photons arriving at a detector, to, quite famously, the number of men kicked to death by horses in the Prussian cavalry.
The other side of the coin is that you can use this distribution to calculate the time between events, which is what we want to use to describe the time between buses arriving. It turns out that this is an exponential distribution, and the probability of a bus arriving between t and t+dt after the last one is

$$p(t)\,dt = \lambda_2 e^{-\lambda_2 t}\,dt$$
This means that there are lots of short gaps, and fewer long gaps between buses arriving. Now, real buses don't follow such a distribution in detail, but let's stick with this because the maths gets more funnerer.
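If you want to convince yourself about the short gaps, here is a minimal Python sketch. It assumes 20 buses per hour, and the function name `bus_gaps` and the random seed are just my choices for illustration. It draws exponential gaps with the standard library and measures what fraction of them are shorter than the average gap.

```python
import random

def bus_gaps(bus_rate, n_buses, seed=0):
    """Draw n_buses gaps (in hours) from an exponential distribution with rate bus_rate."""
    rng = random.Random(seed)
    return [rng.expovariate(bus_rate) for _ in range(n_buses)]

bus_rate = 20.0                      # assumed: buses per hour
gaps = bus_gaps(bus_rate, 100_000)

mean_gap = sum(gaps) / len(gaps)     # should be close to 1 / bus_rate hours
# Fraction of gaps shorter than the theoretical average gap of 1 / bus_rate.
short_fraction = sum(g < 1.0 / bus_rate for g in gaps) / len(gaps)

print(f"mean gap: {60 * mean_gap:.2f} minutes")
print(f"fraction of gaps shorter than the average: {short_fraction:.3f}")
```

For an exponential distribution that fraction should come out close to 1 - 1/e, or about 63 per cent, so most gaps really are shorter than the average one.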
Right, the next question is how many people accumulate at the bus stop between buses? It's over to the Poisson distribution again. In a time, t, the distribution for the number of people, n, at the bus stop is given by

$$P(n \mid t) = \frac{(\lambda_1 t)^n e^{-\lambda_1 t}}{n!}$$
Why is there a distribution? Well, people are dribbling in at random and in the same time interval there might be one person, or two, or ten or even none.
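To put some numbers on that, here is a small sketch of the Poisson probabilities. The rate of one person per minute and the three-minute wait are assumptions picked purely for illustration, and `poisson_pmf` is my own helper name.

```python
import math

def poisson_pmf(n, people_rate, t):
    """Probability that exactly n people arrive in time t (hours) at people_rate per hour."""
    mean = people_rate * t
    return mean ** n * math.exp(-mean) / math.factorial(n)

people_rate = 60.0   # assumed: one person per minute on average
t = 3.0 / 60.0       # assumed: a three-minute wait, in hours

for n in (0, 1, 2, 3, 10):
    print(n, round(poisson_pmf(n, people_rate, t), 4))
```

Even with three people expected on average, there is a real chance of finding nobody at the stop, and a small chance of finding ten.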
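OK, we can stick the two equations together and ask the question "what is the distribution of the number of people leaving on the buses?" What we end up with is an expression that looks like this.

$$P(n) = \int_0^\infty \frac{(\lambda_1 t)^n e^{-\lambda_1 t}}{n!}\,\lambda_2 e^{-\lambda_2 t}\,dt$$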
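Why is there an integral in there? Well, multiplying the two probabilities gives us the distribution of the number of people on a bus, after a waiting time between t and t+dt, and we then have to add up over all possible waiting times. After a little algebra, this becomes

$$P(n) = \frac{\lambda_2 \lambda_1^n}{n!}\,\frac{\Gamma(n+1)}{(\lambda_1+\lambda_2)^{n+1}}$$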
What is that? The final probability distribution depends upon a rather mysterious quantity called the Gamma function. I could write a long post on the Gamma function, but seeing that we are only interested in integer values (because we can only have whole numbers of passengers on the bus), we know that

$$\Gamma(n+1) = n!$$
and so the distribution for the number of passengers on the bus becomes

$$P(n) = \frac{\lambda_2 \lambda_1^n}{(\lambda_1+\lambda_2)^{n+1}}$$
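If you don't trust the algebra, a brute-force check is easy enough. The sketch below assumes the result has the geometric form P(n) = λ2 λ1^n / (λ1+λ2)^(n+1), and compares it with a direct simulation of people and buses turning up at random. The small rates (3 people and 2 buses per hour) and the function names are my own choices, made only to keep the run quick.

```python
import random
from collections import Counter

def simulate_bus_loads(people_rate, bus_rate, n_buses, seed=1):
    """Simulate the number of passengers picked up by each of n_buses buses."""
    rng = random.Random(seed)
    loads = []
    for _ in range(n_buses):
        gap = rng.expovariate(bus_rate)          # time since the previous bus
        count = 0
        arrival = rng.expovariate(people_rate)   # time of the first person
        while arrival < gap:                     # everyone arriving before the bus gets on
            count += 1
            arrival += rng.expovariate(people_rate)
        loads.append(count)
    return loads

def predicted_pmf(n, people_rate, bus_rate):
    """Assumed analytic form: lambda2 * lambda1^n / (lambda1 + lambda2)^(n + 1)."""
    total = people_rate + bus_rate
    return bus_rate * people_rate ** n / total ** (n + 1)

people_rate, bus_rate = 3.0, 2.0   # assumed small rates, per hour
loads = simulate_bus_loads(people_rate, bus_rate, 200_000)
counts = Counter(loads)

for n in range(5):
    simulated = counts[n] / len(loads)
    print(n, round(simulated, 4), round(predicted_pmf(n, people_rate, bus_rate), 4))
```

The two columns should agree to within the noise you expect from a finite number of simulated buses.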
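As a simple case, suppose people and buses turn up at the same average rate, so that λ1 = λ2. The distribution of the number of passengers becomes

$$P(n) = \frac{1}{2^{n+1}}$$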
In fact, we have a pretty interesting progression here, and it has been known since antiquity that 1/2 + 1/4 + 1/8 + ... = 1, which is what you want all of the probabilities to add up to.
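The same trick works for any pair of rates, not just equal ones. As a quick check, assuming the distribution has the geometric form P(n) = λ2 λ1^n / (λ1+λ2)^(n+1), and writing r for the ratio λ1/(λ1+λ2) (my shorthand, nothing deeper), the usual sum of a geometric series gives

$$\sum_{n=0}^{\infty} P(n) = \frac{\lambda_2}{\lambda_1+\lambda_2}\sum_{n=0}^{\infty} r^n = \frac{\lambda_2}{\lambda_1+\lambda_2}\cdot\frac{1}{1-r} = \frac{\lambda_2}{\lambda_1+\lambda_2}\cdot\frac{\lambda_1+\lambda_2}{\lambda_2} = 1$$

Since r is always less than one, the series converges whatever the rates are.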
Let's consider a more realistic scenario, with 1000 people per hour arriving at the bus stop, and 20 buses per hour. What does the distribution of bus occupancy look like? You might think that it is most likely that there are about 50 people on each bus. Here's the distribution: it peaks at zero passengers and declines steadily from there, with a long tail out to large numbers.
You might think that chucking in more buses per hour is a solution to the problem, but we always have this shape. The most likely number of people on a bus is zero. Throwing in fewer buses doesn't help either; the most probable number of passengers is still zero. Luckily this is not how real buses at Sydney Olympic Park operate, or we would never get home!
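Here is a rough sketch of that claim in Python. It assumes the geometric form P(n) = λ2 λ1^n / (λ1+λ2)^(n+1) for the occupancy, keeps 1000 people per hour, and tries a few bus rates. The alternative rates of 5 and 100 buses per hour, the cut-off at 5000 passengers and the helper names are all my own assumptions.

```python
def occupancy_pmf(n, people_rate, bus_rate):
    """Assumed geometric form for the probability of n passengers on a bus."""
    total = people_rate + bus_rate
    return (bus_rate / total) * (people_rate / total) ** n

def summarise(people_rate, bus_rate, n_max=5000):
    """Return the most likely load and the average load, truncating at n_max passengers."""
    pmf = [occupancy_pmf(n, people_rate, bus_rate) for n in range(n_max)]
    mode = max(range(n_max), key=lambda n: pmf[n])
    mean = sum(n * p for n, p in enumerate(pmf))
    return mode, mean

people_rate = 1000.0   # people per hour
for bus_rate in (5.0, 20.0, 100.0):   # assumed: fewer, the same, and more buses per hour
    mode, mean = summarise(people_rate, bus_rate)
    print(f"{bus_rate:.0f} buses/hour: most likely load = {mode}, average load = {mean:.1f}")
```

The average load is still λ1/λ2, which is 50 for 20 buses per hour, but the single most likely load stays at zero however many buses you run.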
One last question then, as I am cooking pizzas tonight and need to get started. Let's imagine that you are not the gate guard, but a passenger, just a random passenger. What's the most likely number of passengers on the bus (including you)?
While the most probable number of passengers on a bus is zero, there is no one there to see that. And while there are lots of buses with 1 person on, it is actually unlikely to be you (I know this might sound a little paradoxical, but think about it).
So, we effectively have to weight the probability of each number of passengers by the number of passengers there to see it. What does that do?
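If you would rather let the computer do the chewing, here is one way the weighting could be sketched. It assumes the geometric form P(n) = λ2 λ1^n / (λ1+λ2)^(n+1) for the gate guard's view, multiplies by n, and divides by the average load λ1/λ2 so that the probabilities add up to one again. The name `passenger_view_pmf` and the cut-off at 5000 passengers are my own inventions.

```python
def occupancy_pmf(n, people_rate, bus_rate):
    """Assumed geometric form for the gate guard's view: probability of n passengers on a bus."""
    total = people_rate + bus_rate
    return (bus_rate / total) * (people_rate / total) ** n

def passenger_view_pmf(n, people_rate, bus_rate):
    """Probability that a randomly chosen passenger is on a bus carrying n people."""
    mean_load = people_rate / bus_rate   # average passengers per bus, used to normalise
    return n * occupancy_pmf(n, people_rate, bus_rate) / mean_load

people_rate, bus_rate = 1000.0, 20.0
weighted = [passenger_view_pmf(n, people_rate, bus_rate) for n in range(5000)]
peak = max(range(len(weighted)), key=lambda n: weighted[n])

print("weighted probabilities sum to", round(sum(weighted), 6))
print("the passenger's-eye distribution peaks at n =", peak)
```

Empty buses drop out entirely, because nobody is on board to count them, and the peak moves well away from zero.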
And while you are chewing that over, I'm off to do some cooking.