Categories
Notes Old site

Who benefits from documentation

Just about everybody benefits from clear, accurate documentation. The benefits of out of date documentation written without a specific purpose in mind are less obvious, and indeed are often non-existent. To get the most out of documentation, the following should be true:

  • It should be written specifically as specification, record, explanation or instruction. That way it will be easy for the reader to understand.
  • It should be easily available to anybody who wants it; this often (but by no means always) means that it should be part of the spreadsheet that it applies to.
  • It should be kept up to date, otherwise it might mislead the reader.
  • When writing documentation, it is often helpful to bear in mind the specific type of user for whom you are writing.

It is not only the reader who benefits from documentation. The writer often gains a lot, too; if the writer is a developer, writing some of the documentation in advance can help to focus the mind and prevent false starts. Articulating your ideas can save you from many dead ends. The documentation process can often throw up bugs in the spreadsheet or ambiguities in the specification, especially if the writer is a user other than a developer of the spreadsheet.

Appropriate documentation will help people other than the developer have confidence in the results of a spreadsheet. They will be to able to tell what the spreadsheet is intended to do, how it does it, what data it uses, how to use it and interpret the results, and what tests and reviews have been performed.

Resources

The following external links are relevant:

Categories
Notes Old site

Forms of documentation

Documentation may take many different forms:

  • A separate document is often used for long, formal specifications that are themselves subject to review and sign-off. Records of changes and versions may also be kept in a separate document or database. The potential disadvantage of a separate document is that it may be difficult to maintain consistency between the documentation and the spreadsheet.
  • Implicit documentation is widely used. The names of worksheets, ranges and cells, and modules and variables in VBA code come into this category. Formatting may also provide documentation, for example if colours are used consistently to indicate which cells are inputs, or to indicate potential errors.
  • Documentation within the code is perhaps the most common form. It is particularly easy to use in spreadsheets (compared to other types of software) as it often simply takes the form of text in cells. It is well suited to instructions and some types of explanation, as it can be placed close to the cells to which it refers.
  • Documentation as a separate block in code, for example as a separate worksheet, can be very useful, especially for keeping records.
  • Documentation in the user interface overlaps with documentation within the code for spreadsheets, but is clearly distinct in more conventional software. It includes text in user forms, text boxes and other images.

The most appropriate documentation method depends on the type of documentation and the user for which it is intended, as well as the culture in which it will be used. If your team commonly uses separate documents or centralized systems for specific types of documentation, then you should conform to the common practice.

Resources

The following external links are relevant:

Categories
Notes Old site

Who is helped by what documentation?

There is no type of documentation that is always useless to any type of user, but in general some types are more useful than others. The following table summarizes the general utility of the different types of documentation to different users:

Specify Record Explain Instruct
Viewer X X X X
Player X X X X
Tester X X X X
Changer X X X X
Developer X X X X
Auditor X X X X

The table can form the basis of a useful checklist when reviewing the presence and adequacy of documentation.

Specify

A fairly abstract level of specification is useful to everybody: “This spreadsheet models the effect of inflation on sales” for example. Some users need rather more detail than that: a Changer or Developer needs full details of the theoretical model that should be used, so that they can implement it, and a Tester needs those details so that they can see whether it has been implemented. You can’t tell whether a spreadsheet is doing the right thing unless you know what the right thing is. Even a Viewer or Player may need to know the theory behind the spreadsheet so that they can interpret the results.

Record

A record of what was done is especially useful to an Auditor, but is also helpful to Changers and Developers, who may need to work out why things are going wrong. However, a record is rarely a good substitute for either explanation or instruction. Recording documentation may be a simple narrative of steps taken, or a more formal record of versions, changes made, reviewers, tests performed, and so on. Information about the sources of data or parameters may count as either recording or explaining documentation.

Explain

Explanations of how the spreadsheet is put together, or why specific design decisions were made are usually primarily intended for Changers and Developers, but are also useful to Auditors. Explanations of the sources of data or parameters may be useful for all users. A simple narrative of “what I did when building this spreadsheet” is rarely useful as an explanation, especially as any information it gives may be superseded later on in the narrative. Explanations of the significance of outputs are useful to all users, especially Viewers or Changers who may not have the skills necessary to infer what is going on from the formulae or code.

Instruct

Instructions may be directed at any type of user. They are especially important for Viewers and Players, who may not always be able to infer what should be done from the structure or code in the spreadsheet itself.

Resources

The following external links are relevant:

Categories
Notes Old site

Types of user

A spreadsheet may have the following types of user:

  • A Viewer looks at the results, but makes no changes. A Viewer may never see the actual spreadsheet, but only printouts of selected parts, but they are still using the spreadsheet.
  • A Player changes input cell values, but doesn’t change formulae and layouts. They may execute macros.
  • A Changer changes formulae and makes minor layout and formatting changes. They correct and enhance the spreadsheet, without making major changes.
  • A Developer makes major changes, writes VBA code, designs and implements the spreadsheet from scratch.
  • An Auditor may change input values. They must understand the working of the spreadsheet and be able to trace the inputs and outputs.

A Reviewer may share the needs of any of the other types of users, depending on what aspects of the spreadsheet they are reviewing. A Tester is essentially like a Player: their role is to check the operation and results of the spreadsheet, without making changes to its operation. For simple spreadsheets there may be no effective difference between many of these roles.

The same person may use the spreadsheet in several different ways during its lifetime. Users are characterized by the role they are playing at the time, rather than by what role they may be capable of playing.

Resources

The following external links are relevant:

Categories
Notes Old site

What documentation does

Spreadsheet documentation may do any of the following:

  • Specify the intended working of the spreadsheet
  • Record what was done
  • Explain how the spreadsheet works
  • Instruct the user how to use or update the spreadsheet

Ideally, a spreadsheet would have documentation serving all four purposes.

A specification may be anything from a single sentence to a separate, long, document. In essence, it describes the theory of the spreadsheet rather than the implementation.

A record of what was done does not necessarily contribute much to the understanding of the current state of the spreadsheet, if it describes changes that were made much earlier in its life.

An explanation describes the implementation of the spreadsheet. Instructions are often used to remind users how and when to perform manual processes, such as running a macro, or what inputs are required.

Documentation is usually easier to understand if it is written for a single purpose. It really helps a user if instructions are written (and labelled) as instructions, rather than buried deep in the record of a change.

Resources

The following external links are relevant:

Categories
Notes Old site

Bookmarks in Excel

Wouldn’t it be nice if you could create bookmarks in Excel, in the same way that you can bookmark pages in your browser? The ideal would be to record the exact screen position, so that you could automatically return to the same view at some point in the future.

Well, you can! But they are not called bookmarks. Instead, you can use Custom Views.

Custom Views

To define a Custom View, set things up so that they look the way you want them to look. The following settings are recorded when you define a Custom View:

  • The currently active worksheet
  • The size of the window, zoom percentage, position on screen
  • Headings that have been set using Freeze Panes
  • Hidden sheets

You can choose whether to record the following settings:

  • Print settings
  • Hidden rows, columns and filters

When things look the way you want them to, open the Custom Views dialog from the menu, using View\Custom Views...

Picture of Custom Views dialog

Then click on the Add button, to show the Add View dialog.

Picture of Add View dialog

Select the options you require, type in a name, click on OK, and your new Custom View has been defined.

To view it again, bring up the Custom Views dialog and select its name in the list. You can either double click on the name, or select it and click on Show.

Defined Names

You can also use Names, by defining a Name for a range. You can then go to that range by selecting it in the Name Box.

Picture of Name Box

When you choose a Name from the Name Box, Excel adjusts the view so that the range to which the Name refers is visible on the screen. If the range is too large, as much as possible of the range, starting at the top left hand corner, is shown. In general it appears that Excel changes the view as little as possible, so that often single cell ranges are shown towards the bottom right of the screen.

It is only possible to choose Names that are defined in the current workbook. Names that are local to a worksheet are shown only when you are already viewing that worksheet.

It is somewhat more convenient to define a Name by typing in the Name Box, and then viewing the range by selecting it in the Name Box, but you don’t get the fine control over what you see that you do when you use Custom Views.

Categories
Notes Old site

Lookups in Excel

The HLOOKUP() and VLOOKUP() functions are two of the most commonly used functions in Excel, but there are a number of pitfalls that should be avoided.

Both functions take either three or four arguments (the fourth is optional):

  1. lookup_value the value that you are trying to match.
  2. table_array the range in which you are trying to look up the value. The function will try to match your lookup_value in the leftmost column (VLOOKUP) or topmost row (HLOOKUP) of this range.
  3. col_index_num (VLOOKUP) or row_index_num (HLOOKUP) the column or row containing the corresponding values you want to find. The numbering starts at 1 for the column or row containing the values to be matched.
  4. range_lookup whether to perform an exact or approximate lookup. If this is TRUE you’ll get an approximate lookup, if it is FALSE you’ll get an exact lookup. This arrgument is optional: if no value is specified, it will be assumed to be TRUE.

When should you use an approximate lookup, and when an exact lookup? It’s easier to decide if you understand what is going on as the lookup is performed.

Approximate lookups

If you are doing an approximate lookup, Excel will take each cell in the leftmost column (VLOOKUP) or topmost row (HLOOKUP) in turn, and see how it compares to the value it is looking for (lookup_value). If the value of the current cell is the same as the lookup_value, it stops, having found the correct row (VLOOKUP) or column (HLOOKUP). If the value of the current cell is less than the lookup_value, it moves on to the next cell. If the value of the current cell is greater than the lookup_value, it stops, and uses the previous row (VLOOKUP) or column (HLOOKUP). In other words, Excel returns the next largest value, as long as the lookup range is sorted in ascending order.

This provides a way of remembering the significance of the optional argument: use TRUE when the lookup range is sorted, and FALSE when it is not sorted.

Using TRUE when the lookup range is not sorted will lead to unpredictable results: for VLOOKUP, the row that will be used to provide the value will be the row above the first row whose leftmost column has a value greater than lookup_value.

If the lookup_value is smaller than the first value in the leftmost column (VLOOKUP) or topmost row (HLOOKUP), Excel will return the #N/A value.

Exact lookups

When Excel performs an exact lookup, it will take each cell in the leftmost column (VLOOKUP) or topmost row (HLOOKUP) in turn, and see how it compares to the value it is looking for (lookup_value). If the value of the current cell is the same as the lookup_value, it stops, having found the correct row (VLOOKUP) or column (HLOOKUP). If it gets to the end of the column (VLOOKUP) or row (HLOOKUP) without finding a match, it will return the #N/A value.

Approximate or exact?

You should always specify the optional fourth argument to these functions, so that it is obvious which sort of lookup you intend.

Whether you use an approximate or exact lookup will, of course, depend on the use to which you are going to put the values that you are looking up. However, there are some situations that you should avoid:

  • Don’t use an approximate lookup if the lookup table isn’t sorted in ascending order by the values in the leftmost column (VLOOKUP) or topmost row (HLOOKUP).
  • Don’t use an exact lookup if your lookup values are floating point numbers (ie, any numbers with decimal points). This is because floating point numbers have to be exactly the same in order to match: 4.9999999999999999 won’t match 5. Remember that even if a number is displayed as 5.0, it may actually be 4.9999999999999999.
  • Be very careful when using floating point numbers even with approximate lookups. 4.9999999999999999 is less than 5, so Excel wouldn’t find the correct row even with an approximate lookup.

Calculation speed

Lookups can take a long time to compute, if the lookup table is large or if many lookups are performed. Approximate lookups are faster, as Excel doesn’t always have to go through every row or column in the lookup table. It is often possible to use an approximate lookup, if the lookup table is sorted and is known to contain all possible values.

Many people use a formula such as =IF(ISNA(VLOOKUP(Currency,Rates,2,FALSE)),0,VLOOKUP(Currency,Rates,2,FALSE)). Because of the way Excel works, this always does both lookups in the formula, thus doubling the calculation time. It is much more efficient to have a column (or row) to contain the raw lookup results, and then test them with the ISNA() separately.

Categories
Notes Old site

SunTrust earnings restatement

In late 2004 one of the big US banks, SunTrust Banks, announced a restatement of earnings for the first two quarters. This followed problems they had found with their loan loss allowances, or more specifically, with the model they were using for their loan loss allowances. Part of their press release says: “There were numerous errors in the loan loss allowance calculations for the first and second quarters, including data, model and formulaic errors.” In other words, they are saying that the data that went into the model was wrong, the model itself was not a good fit to reality, and on top of that they hadn’t even implemented this faulty model properly. That covers pretty much everything that can go wrong with a model, if you count data as including parameters (see my article how to believe your models).

The fall out from the modelling problem was definitely non trivial. Q1 earnings were restated by 1%, Q2 earnings by 6%. In Q2, the loan loss allowances changed by 90%. The loan loss allowances had been overestimated, so this means that in the second quarter they were out by an order of magnitude. Three people lost their jobs as a result of the problem, including the Chief Credit Officer. The Financial Controller was reassigned to a position “with responsibilities that involve areas other than accounting or financial reporting” – ie, neither finance nor control.

Moreover, SunTrust’s directors were unable to sign off under section 404 of Sarbanes-Oxley at the next year end. They said that they were likely “not be able to conclude that the Company’s internal control over financial reporting was effective at such date.”

So why did all this happen? Well, apparently they were bringing in new processes and a new model in order to comply with Sarbanes-Oxley. This evidently proved more difficult than they anticipated. They say “The Company’s implementation of a new allowance framework in the first quarter was deficient. The deficiencies included inadequate internal control procedures, insufficient validation and testing of the new framework, inadequate documentation and a failure to detect errors in the allowance calculation.” They also point to deficiencies in spotting the problem, and then in doing something about it. In particular, “certain members of the Company’s management did not treat certain matters raised by the Company’s independent auditor with an appropriate level of seriousness.”

The morals are fairly obvious. First, models matter, and mistakes in models can be significant. Second, change is risky. It can be very risky. (On the other hand, not changing also has its risks). Thirdly, take problems seriously.

Resources

The following external links are relevant:

Categories
Notes Old site

Nevada City budget problems

In January 2006 it was discovered that there was a deficit of $5 million in the budget for Nevada City. The budget spreadsheet was the same as the one for 2005, updated for 2006. Apparently it was working correctly until sometime in late December, when it developed a problem. The difference was a $5M deficit in the water and sewers fund, that wasn’t present on 14 December but was there in the version discussed at a meeting on 3 January. Evidently some councillors noticed the change. The finance director hadn’t noticed, as he had a printed version that didn’t show the error. Once the error was noticed, it took the finance director about a day to fix it. As he was doing so, he found other errors.

Obviously the spreadsheet had changed after it had been printed out. We have no information about how the error was introduced. We are not told whether the other errors that were found and corrected were actually in the 2005 version of the spreadsheet, but it wouldn’t surprise me.

It appears that the actual Excel file was sent round to the councillors, and indeed posted on the city’s web site. This raises other concerns, unless it was protected so that it couldn’t be changed.

Categories
Notes Old site

Kodak earnings restatement

In November 2005 Kodak restated its Q3 results by $9 million. The restatement was attributed to restructuring and severance costs, plus a real estate gain. The restructuring costs were apparently because they got the accounting treatment wrong; we were told that “the magnitude of the worldwide restructuring program the company is undertaking imposes significant challenges to ensure the appropriate accounting”

There was also an error of $11 million in the severance calculation for just one employee. The error was traced to a faulty spreadsheet: apparently “too many zeros were added to accrued severance”. No payment was actually made to the employee in question (which was lucky for Kodak, but maybe unlucky for the employee). It sounds as if the error was either a simple data entry problem, or, possibly more likely, that the spreadsheet was expecting an entry in $’000 but got one in $.

This could be yet another example of a spreadsheet that is theoretically correct, but is not easy to use. If the wrong
information is used for the calculations, then the answers will be wrong: it’s our old friend, Garbage In, Garbage Out.