Categories
Notes Old site

Types of user

A spreadsheet may have the following types of user:

  • A Viewer looks at the results, but makes no changes. A Viewer may never see the actual spreadsheet, but only printouts of selected parts, but they are still using the spreadsheet.
  • A Player changes input cell values, but doesn’t change formulae and layouts. They may execute macros.
  • A Changer changes formulae and makes minor layout and formatting changes. They correct and enhance the spreadsheet, without making major changes.
  • A Developer makes major changes, writes VBA code, designs and implements the spreadsheet from scratch.
  • An Auditor may change input values. They must understand the working of the spreadsheet and be able to trace the inputs and outputs.

A Reviewer may share the needs of any of the other types of users, depending on what aspects of the spreadsheet they are reviewing. A Tester is essentially like a Player: their role is to check the operation and results of the spreadsheet, without making changes to its operation. For simple spreadsheets there may be no effective difference between many of these roles.

The same person may use the spreadsheet in several different ways during its lifetime. Users are characterized by the role they are playing at the time, rather than by what role they may be capable of playing.

Resources

The following external links are relevant:

Categories
Notes Old site

What is a bug?

In computing parlance, unlike normal life, bugs and viruses have nothing to do with each other. A bug is simply a fault, or error, while a virus is a malicious program that propagates from computer to computer by hiding itself inside another program or document.

Legend has it that the term bug was invented by Grace Murray Hopper, a Rear Admiral in the US Navy, who was one of the pioneers of computing. Early computers were huge machines made of relays and valves and wires and so on; compared to today’s sleek laptops or PDAs they were veritable Heath Robinson contraptions. Anyway, they were open to the atmosphere. Hopper tells the story:

Things were going badly; there was something wrong in one of the circuits of the long glass-enclosed computer. Finally, someone located the trouble spot and, using ordinary tweezers, removed the problem, a two-inch moth. From then on, when anything went wrong with a computer, we said it had bugs in it.

Hopper’s team introduced a new term into computing jargon when they said that they had debugged the machine. However, contrary to popular legend, the term bug had been in use since 1878, or even earlier, when Edison used it to refer to a flaw in a system.

Resources

The following external links are relevant:

Categories
Notes Old site

Spreadsheet error rates

Think about it for a moment. Do 10% of spreadsheets contain errors? Or 20% (for the pessimists among you)? These rates are high, and should be enough to make alarm bells ring, but the actual rates are probably far higher.

A few years ago Professor Ray Panko, at the University of Hawaii, pulled together the available evidence from field audits of spreadsheets. These are the results he shows:

Study Number of
spreadsheets
Number with
errors
Percentage
with errors
Coopers & Lybrand, 1997 23 21 91%
KPMG, 1997 22 20 91%
Lukasic, 1998 2 2 100%
Butler (HMCE), 2000 7 6 86%
Total 54 49 91%

More recently Lawrence and Lee analysed 30 project financing spreadsheets. All 30 had errors; the error rate was 100%.

It’s difficult to know how to interpret these results. They are certainly very high numbers, and send a chill down my spine. However, in terms of all the spreadsheets out there in the world, these error rates may be:

Understated
because not all the errors were caught in the audit. Spreadsheet reviewers and auditors are subject to human error like the rest of us, and depending on how long they spent on the audit may well have missed some of the errors.
because the sample of spreadsheets chosen for audit was biased. Possibly only those that were considered to be most important, and over which the greatest care had been taken, were selected.
Overstated
because the sample of spreadsheets chosen for audit was biased. Possibly only those that were considered to be most likely to have errors in were selected.
Not comparable
because different definitions of significant errors were used in the different studies.

Cell error rates

Other studies surveyed by Panko show that the error rate per cell is between 0.38% and 21%. These results are difficult to interpret: are they percentages of all cells, cells containing formulae, or unique formulae? (If a formula is copied down a row or column, it may count as many formula cells, but is only one unique formula). If we assume a rate of 1% of unique formulae having errors, and look at spreadsheets containing from 150 to 350 unique formulae, we find that the probability of an individual spreadsheet containing an error is between 78% and 97%. This is (obviously) a high number, but is reasonably consistent with the field audit results discussed above.

Lawrence and Lee found that 30% of the spreadsheets they reviewed had errors in over 10% of unique formulae; one spreadsheet had errors in more than one in five of unique formulae. Interestingly, this was the smallest spreadsheet, showing that error rates don’t necessarily increase with complexity.

Self confidence

To make matters worse, people tend to overestimate their own capabilities. Panko describes an experiment in which people were asked to say whether they thought spreadsheets that they had developed contained errors. On the basis on their responses, about 18% of the spreadsheets would have been wrong; the true figure was 81%. The actuary who told me “As far as I am concerned, none of my spreadsheets has ever had a bug in it” was probably deluding himself.

One source of this over confidence is probably lack of testing and thorough review. If you don’t think that your spreadsheet has errors in, you may not bother testing it, and so never find the errors. Nothing ever happens to make you revise your view.

Summary

It’s extremely likely that a large proportion of spreadsheets contain errors. People don’t realise just how large that proportion is, and also have misplaced confidence in their own spreadsheets.

Categories
Notes Old site

The true significance of bugs in spreadsheets

There are many reports of extremely high occurrence rates for bugs in spreadsheets. From reading them, you might think that very few spreadsheets are error-free.

However, many people who are aware of the likelihood of errors in spreadsheets go to great lengths to find and remove them. I have found few significant errors in the often large and complex spreadsheets I have reviewed (mainly in the insurance industry).

In my view the true significance of bugs lies not in their existence, which can lead to spreadsheets producing erroneous results, but in the enormous amount of time and effort that goes into preventing them. Spreadsheets are usually built and maintained by people who have little or no software engineering expertise. These people often:

  • Do not have good software development processes
  • Are not aware of the characteristics of good software and how they apply to spreadsheets
  • Do not know good methods of testing and reviewing software
  • Do not know how to design software (especially spreadsheets) so as to reduce the likelihood of bugs

I believe that the use of simple software engineering techniques can help. Some of these techniques are described, somewhat briefly, in other notes on this site.

Categories
Notes Old site

Criteria for good software

What is the difference between good software and bad software? It is reasonably easy to recognise the extremes when we encounter them, but there is a set of criteria that can be used to explain the differences. These criteria are also useful when building or modifying software, as they can be used to help make design decisions.

Good software has the following characteristics:

  • Correctness: it produces the right results and exhibits the right behaviour.
  • Usability: the user finds it easy to make it perform the desired functions.
  • Maintainability: it can be easily maintained and updated.
  • Availability: it is available (ie, will run) when it is needed.
  • Useful: it performs a task that someone wants performed.

It is often thought that performance should be another criterion, but in my view the need for good performance results from the usability and availability criteria.

These criteria apply to every kind of software, including spreadsheets. Spreadsheets are an interesting case because it is difficult to tell the difference between usability and maintainability, as one of their main uses is to be updated and enhanced.

Resources

The following external links are relevant:

Categories
Notes Old site

User-developed software

User-developed software is, as its name suggests, software that is developed by users rather than by specialist developers. It includes spreadsheets, parameter driven financial models, personal databases, VB code, and so on.

Caution: user-developed systems may be hazardous to your organization
Davis, 1981

User-developed software has many advantages, and can really leverage the expertise of those users. You, the users, maintain control over the system being developed; you hope for rapid turnaround on modifications; and a whole layer of communication is removed from between the concepts being modelled and the people doing the implementation.

However, the expertise of the user is unlikely to included extensive software engineering experience. Many systems developed by users end up being very large and complex, and are not as easy to maintain and enhance as they should be. Moreover, as they are often not subject to rigorous quality controls, they may contain significant bugs, be hard to use and lack robustness.

Resources

The following external links are relevant:

Categories
Notes Old site

Modularity

Modular systems are usually easier to maintain and update, as well as being faster to develop in the first place. So what is modularity?

A modular system is one that consists of a number of small, coherent, chunks (or modules) each of which is self-contained and has a well-defined interface.

Modularity is the single attribute of software that allows a program to be intellectually manageable.
G Myers, 1978

Modularity helps to reduce complexity. A modular system can be understood in terms of the individual modules, without having to grasp all the details at once. Modularity thus provides abstraction.

If the module boundaries are drawn appropriately, modifications to the system are often confined to one or two modules, thus simplifying the modification process.

Another advantage of modularity is that it facilitates code reuse. Again, if the modules are well defined, a single module can be used in several different systems, thus reducing the need to reinvent the wheel.

Categories
Notes Old site

Abstraction

The notion of abstraction is extremely important for writing good programs or designing good systems. Basically, the idea is that you want to avoid getting hung up on the details: you want to separate the things that change from those that
don’t
.

In Excel spreadsheets the easiest example of abstraction is the use of names. You can define a name to refer to a particular range: for example, you could define VATrate to refer to cell D3 on sheet TaxCalcs. By using a name for this, you have abstracted away two levels of detail:

  • The actual VAT rate to use; instead you use the value in cell D3 on sheet TaxCalcs.
  • The actual cell that contains the value; instead you use the name VATrate.

Using abstraction usually has advantages for maintainability, and we can see several of them in this example.

  1. A formula that reads =B26*(1+VATrate) is much easier to understand than either =B26*(1+$D$3) or =B26*1.175
  2. If you want to change the rate of VAT that you are assuming, it is much easier to change the value of a single cell than it is to go through all the formulae in the workbook looking for places where 17.5% is used, and changing them all individually.
  3. If you want to change the layout of your sheet, it is much easier to change the definition of the name VATrate than it is to change all references to cell D3, or to make sure that you drag and drop the cell or cut and paste it in such as way as to preserve all the references.

There are other ways of making use of the advantages of abstraction, but the use of names is the most common in Excel.