Interesting Software

Reinhart and Rogoff: was Excel the problem?

There’s a bit of a furore going on at the moment: it turns out that a controversial paper in the debate about the after-effects of the financial crisis had some peculiarities in its data analysis.

Rortybomb has a great description, and the FT’s Alphaville and Tyler Cowen have interesting comments.

In summary, back in 2010 Carmen Reinhart and Kenneth Rogoff published a paper, Growth in a Time of Debt, in which they claim that “median growth rates for countries with public debt over 90 percent of GDP are roughly one percent lower than otherwise; average (mean) growth rates are several percent lower.” Reinhart and Rogoff didn’t release the data they used for their analysis. Since then, apparently, people have tried and failed to reproduce the analysis that gave this result.

Now, a paper has been released that does reproduce the result: Herndon, Ash and Pollin’s Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff.

Except that it doesn’t, really. Herndon, Ash and Pollin identify three issues with Reinhart and Rogoff’s analysis, which mean that the result is not quite what it seems at first glance. It’s all to do with the weighted average that R&R use for the growth rates.

First, there are data sets for 20 countries covering the period 1946-2009. R&R exclude data for three countries for the first few years. It turns out that those three countries had high debt levels and solid growth in the omitted periods. R&R didn’t explain these exclusions.

Second, the weights for the averaging aren’t straightforward (or, possibly, they are too straightforward). Rortybomb has a good explanation:

Reinhart-Rogoff divides country years into debt-to-GDP buckets. They then take the average real growth for each country within the buckets. So the growth rate of the 19 years that the U.K. is above 90 percent debt-to-GDP are averaged into one number. These country numbers are then averaged, equally by country, to calculate the average real GDP growth weight.

In case that didn’t make sense, let’s look at an example. The U.K. has 19 years (1946-1964) above 90 percent debt-to-GDP, with an average growth rate of 2.4 percent. New Zealand has one year in the sample above 90 percent debt-to-GDP, with a growth rate of -7.6 percent. These two numbers, 2.4 and -7.6 percent, are given equal weight in the final calculation, because the countries are averaged equally, even though there are 19 times as many data points for the U.K.
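Using just the two figures quoted above, a toy calculation shows how much the choice of weights matters (this is only the averaging arithmetic, not R&R’s full data set):

```python
# U.K.: 19 country-years above 90% debt-to-GDP, average growth 2.4%.
# New Zealand: 1 country-year above 90%, growth -7.6%.
uk_years, uk_growth = 19, 2.4
nz_years, nz_growth = 1, -7.6

# R&R-style weighting: one vote per country, regardless of years.
by_country = (uk_growth + nz_growth) / 2   # about -2.6

# Alternative: weight each country-year equally.
by_year = (uk_years * uk_growth + nz_years * nz_growth) / (uk_years + nz_years)
# about 1.9

print(by_country, by_year)
```

The same underlying data gives an average of about -2.6 percent one way and about +1.9 percent the other, purely because of how the 19 U.K. years are weighted against New Zealand’s single year.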

Third, there was an Excel error in the averaging: a formula omits five rows. Again, Rortybomb has a good picture of the offending spreadsheet formula.

So, in summary, the weighted average omits some years, some countries, and isn’t weighted in the expected way. It doesn’t seem to me that any one of these is the odd man out, and I don’t think it really matters why either of the omissions occurred: in other words, I don’t think this is a major story about an Excel error.

I do think, though, that it’s an excellent example of something I’ve been worried about for some time: should you believe claims in published papers, when the claims are based on data analysis or modelling?

Let’s consider another, hypothetical, example. Someone’s modelled, say, the effects of differing capital levels on bank solvency in a financial crisis. There’s a beautifully argued paper, full of elaborate equations specifying interactions between this, that and the other. Everyone agrees that the equations are the bee’s knees, and appear to make sense. The paper presents results from running a model based on the equations. How do you know whether the model does actually implement all the spiffy equations correctly? By the way, I don’t think it makes any difference whether or not the papers are peer reviewed. It’s not my experience that peer reviewers check the code.

In most cases, you just can’t tell, and have to take the results on trust. This worries me. Excel errors are notorious. And there’s no reason to think that other models are error-free, either. I’m always finding bugs in people’s programs.

Transparency is really the only solution. Data should be made available, as should the source code of any models used. It’s not the full answer, of course, as there’s then the question of whether anyone has bothered to check the transparently provided information. And, if they have, what they can do to disseminate the results. Obviously for an influential paper like the R&R paper, any confirmation that the results are reproducible or otherwise is likely to be published itself, and enough people will be interested that the outcome will become widely known. But there’s no generally applicable way of doing it.

Excel as a Swiss Army knife

I’ve recently been involved in two surveys looking at what software actuaries use, and how they use it.

The first was organised by the GIRO working party on software use in general insurance, which I chaired. We got over 700 responses to our online survey, which was great. As we describe in the report, Excel is by far the most popular software, used by about 90% of the respondents.

The second survey was one that I ran myself, looking at software use in ICAs. There were 45 responses, from both life and non-life actuaries. Again, Excel was overwhelmingly popular: only one respondent doesn’t use it for preparing ICAs.

One theme that emerged from both surveys is the lack of informed choice. I really get the impression that many people use Excel just because it’s there, and because they can make it do what they need. However, it often takes a surprising amount of hard work. It’s a bit like using a Swiss Army knife to build a house: it probably can be done, but it’s not what you’d choose.

This was really brought home to me at the workshop we presented at the GIRO conference last week, on the results of the survey conducted by the working party. Presumably the people there were somewhat interested in software issues, and might be thought to be more informed than the run-of-the-mill actuary. However, there were at least a couple who were very surprised when we said that we were disappointed at how few people used special-purpose statistical software, given how poor the standard statistical functionality is in Excel.

Well, it’s definitely poor if you are operating in the tails of distributions. Microsoft acknowledges shortcomings in some of its algorithms, but essentially argues that they don’t matter much, as everything is OK as long as you stay out of the tails. But the tails are exactly where most actuaries operate. There are several useful descriptions of what the problems are. One that is especially annoying in some organisations is that the statistical functions were changed in Excel 2003: if there are several different versions of Excel in the organisation, the results calculated by a spreadsheet can vary depending on which version last recalculated it.
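The tail problem isn’t specific to Excel; any naive calculation of a tail probability runs into it. A minimal sketch in Python (using the standard library’s erfc purely to illustrate the floating-point issue): computing a far-tail probability as 1 minus the CDF loses everything to rounding, while a direct survival-function formula keeps it.

```python
from math import erfc, sqrt

def norm_cdf(x):
    # Standard normal CDF, Phi(x), via the complementary error function.
    return 0.5 * erfc(-x / sqrt(2))

def norm_sf(x):
    # Survival function 1 - Phi(x), computed directly, so no cancellation.
    return 0.5 * erfc(x / sqrt(2))

x = 10.0
naive = 1.0 - norm_cdf(x)   # Phi(10) rounds to exactly 1.0 in doubles, so this is 0.0
direct = norm_sf(x)         # roughly 7.6e-24: the probability the naive route destroyed

print(naive, direct)
```

A spreadsheet function built the naive way reports a 10-sigma event as impossible; one built the direct way gets the right order of magnitude. That is the difference that matters when you are pricing extreme events.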

The random number generator in Excel isn’t brilliant, either. A new random number generator was implemented in Excel 2003, but in the first release some of the numbers it generated were negative, instead of in the [0, 1) range as advertised (this has since been fixed in a patch). The generator used in earlier versions of Excel was rather less random. If you are doing serious stochastic work, you should make sure that the generator you are using is sufficiently random. There are a number of add-ins available that provide more sophisticated algorithms than Excel’s standard one.
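To see what “less random” can mean in practice, here is RANDU, a notorious multiplicative congruential generator from the 1960s (not Excel’s actual algorithm; it’s used here purely as a classic example of a weak generator). Its output passes casual inspection, yet every value is an exact linear combination of the previous two, which is why its consecutive triples famously fall on just 15 planes in the unit cube.

```python
def randu(seed, n):
    # RANDU: x_{k+1} = 65539 * x_k mod 2^31. Looks uniform, but since
    # 65539 = 2^16 + 3, it satisfies x_{k+2} = 6*x_{k+1} - 9*x_k (mod 2^31).
    m, a = 2**31, 65539
    xs, x = [], seed
    for _ in range(n):
        x = (a * x) % m
        xs.append(x)
    return xs

xs = randu(1, 1000)

# Every term is exactly determined by the previous two -- a rigid
# structure that a simple histogram of the output would never reveal.
assert all((xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
           for k in range(len(xs) - 2))
```

A generator can look fine in one dimension and still be useless for simulating joint behaviour, which is exactly the kind of work stochastic modelling involves; hence the advice to check, or replace, the built-in generator.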