Just one number

So often, just one number is not only not enough, it’s positively misleading. We often see statistics quoted that, say, the average number of children per family is 1.8. First off, what sort of average? Mean, median or mode? It makes a difference. But really, the problem is that a mean (or median or mode) gives us only very limited information. It doesn’t tell us what the data looks like overall: we get no idea of the shape of the distribution, or the range the data covers, or indeed anything other than this single point.
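To make that concrete, here is a small Python sketch. The two sets of family sizes are invented purely for illustration (they are not real survey data); they are chosen so that both have a mean of 1.8 while looking nothing alike.

```python
import statistics

# Two hypothetical samples of "children per family" (made-up numbers).
# Both sum to 18 over 10 families, so both have a mean of 1.8.
families_a = [0, 1, 2, 2, 2, 2, 2, 2, 2, 3]
families_b = [0, 0, 0, 0, 1, 1, 2, 3, 5, 6]


def summarise(data):
    """Return several summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "min": min(data),
        "max": max(data),
    }


summary_a = summarise(families_a)
summary_b = summarise(families_b)

for name, summary in (("A", summary_a), ("B", summary_b)):
    print(name, summary)
```

The means agree, but the median, the mode and the range all differ, so quoting "1.8" alone would hide the fact that most families in the second sample have no children or one child, while a few have five or six.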

Many traditional actuarial calculations are the same. The net present value of a series of payments tells us nothing about the period of time over which the payments are due, or how variable their amount is, and that information is very important in a wide range of circumstances.
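A rough sketch of the same point in Python. The 5% discount rate, the two payment streams and the helper names `npv` and `discounted_mean_term` are all assumptions made up for this illustration. The second stream is deliberately constructed to have the same net present value as the first.

```python
RATE = 0.05  # assumed annual discount rate, for illustration only


def npv(cashflows, rate):
    """Net present value of (time_in_years, amount) pairs."""
    return sum(amount / (1 + rate) ** t for t, amount in cashflows)


def discounted_mean_term(cashflows, rate):
    """Average payment time, weighted by each payment's present value."""
    weighted = sum(t * amount / (1 + rate) ** t for t, amount in cashflows)
    return weighted / npv(cashflows, rate)


# Stream A: 100 at the end of each of the next 10 years.
stream_a = [(t, 100.0) for t in range(1, 11)]
npv_a = npv(stream_a, RATE)

# Stream B: one lump sum in 30 years, sized to give the same NPV as A.
stream_b = [(30, npv_a * (1 + RATE) ** 30)]
npv_b = npv(stream_b, RATE)

term_a = discounted_mean_term(stream_a, RATE)
term_b = discounted_mean_term(stream_b, RATE)

print(f"NPV A: {npv_a:.2f}, discounted mean term: {term_a:.1f} years")
print(f"NPV B: {npv_b:.2f}, discounted mean term: {term_b:.1f} years")
```

The two net present values match, yet one stream is paid steadily over a decade and the other is a single payment three decades away. Anyone who cares about timing or about sensitivity to the discount rate would treat them very differently.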

Tim Harford has just written a good piece about how the same is true of government statistics, too. He points out that not only is GDP not good for all purposes (a statement that just about everybody agrees with), but that there are lots of other statistics that are good for some purposes but not others. There is no such thing as a single number that measures everything.

And why should there be? Life, the world and everything are variable and complex. There’s no reason to suppose that just one measurement will be able to sum it all up. We can think of the mean (or any other summary statistic) as a very simple model of the data, so simple that it has abstracted nearly all the complexity away. The model, like any other model, may be useful for some purposes, but it’s never going to be the only possible, or only useful, model.