The Art of the Spreadsheet. Copyright 2008 John F. Raffensperger

1. Why is spreadsheet style important?
2. Make your spreadsheet read from left to right and top to bottom.
3. Omit unneeded bytes.
4. Omit unneeded sheets.
5. Organize blocks with care.
6. Attend to blank space.
7. Omit unneeded cells.
8. Format with caution.
9. Show all the information.
10. Spreadsheet errors.
11. How to audit a spreadsheet.
12. Suggestions for operations researchers.
13. Teaching the art of the spreadsheet.
Appendix. Checklist for a spreadsheet.

1. Why is spreadsheet style important?

You should worry about spreadsheet style if you write spreadsheets that must be correct, if you write spreadsheets that others must read, if you must audit other people's spreadsheets, or if you write spreadsheets that you hope to be able to re-use later. This book will not tell you how to use a spreadsheet. It will tell you how you should use a spreadsheet, if you want to communicate.

Throughout the 1980's and 1990's, academics and practitioners, mostly accountants, have called for a more careful approach to writing spreadsheets. Their proposals have been based on prevailing best practices in computer programming, which rely on a modular standard format of data, formulas, and output. Corporate information technology groups believe that standards - based on computer programming - must be imposed on users, so spreadsheets must be written in a “standard format.”

There is a lot of intuitive common sense to this, especially to professionals in decision support. Good programming practice should apply to spreadsheets. Unfortunately, as we shall see, research has found that most spreadsheet users do not have a computer background. Managers are hostile to perceived interference, and they use spreadsheets for the freedom from information technology departments.

This book gives a new style based on writing text, mathematics, and graphics. If we look to these existing, well-developed forms of art, we find a wealth of stylistic principles that can be applied easily to spreadsheets. Curiously, principles that we routinely apply to text, mathematics and graphics, seem to contradict “common wisdom” for spreadsheets. None of the principles are new, but they have never been applied to spreadsheets.

A spreadsheet is a mixture of text, graphics and mathematics. Take one of these elements away, and we have something other than a modern spreadsheet. If you take away the text and graphics, you have a calculator or a mathematical modeling language. Take away the math, and you have a presentation package or a word processor.

A spreadsheet is both a form of expression and a method of computation. Like other forms of expression, people use spreadsheets to present and convince others of ideas. Like other methods of computation, people use spreadsheets to solve problems and make decisions. And as with other forms of expression and computation, spreadsheets should have clear assumptions and logic, and should be written in a way to minimize the chance for mistake and misinterpretation.

Certainly in print, a spreadsheet is like any other form of printed expression. As with printed expression, the spreadsheet writer must rely on text, graphics, and mathematics to communicate. So rules of style for text, graphics, and mathematics apply to spreadsheets. For example, a spreadsheet will be easier to read if its text has proper spelling and grammar. It is more likely to be correct if the spreadsheet writer has basic algebra skills. The spreadsheet is more likely to be visually appealing if the writer knows graphic design.

Tufte, Edward, Envisioning Information, 1990, Graphics Press, Cheshire, Conn., p. 35.

Individually, the stylistic elements of a spreadsheet are not so hard to manage. The problem comes because the spreadsheet writer can put all three in one place. Most of us are not at once clever writers, graceful graphic artists, and brilliant mathematicians. In Tufte's words, “Display of closely-read data surely requires the skilled craft of good graphic and poster design: typography, object representation, layout, color, production techniques, and visual principles that inform criticism and revision.”

For example, suppose we wish to insert a new column in a spreadsheet, to add a new data field to one table. If we are not careful with the layout, we will disturb other tables below or above. In a worse case, adding a new column is impossible because we have gone beyond Excel's limit of 256 columns. Now we have to make a hard decision about the spreadsheet. Do I really need the column I want? Is there another column (perhaps a blank separator) that I can delete? It is likely that the best decision is to tranpose the table, which is a major reconstruction. What should have been a simple change in a data structure becomes a problem in graphical layout.

Writing clear text is hard. Writing clear mathematical prose is very hard. So it is no surprise that writing clear spreadsheets is hard, or that spreadsheets are often difficult to read, or that spreadsheets often contain mistakes. We are not just creating a mathematical model. We are also documenting, illustrating, and typesetting it.

People make lots of mistakes when writing spreadsheets; this been well documented. In an on-line article, Panko wrote:

Raymond Panko, “Applications Development: Finding Spreadsheet Errors,” www.iweek.com, 29 May 1995.

Our experiments and others found undetected logic errors in about 4% of cells. This actually corresponds favorably to error rates in software development. Numerous studies have shown that even experienced professional programmers make undetected mistakes in 3% to 7% of all lines of code before they begin the debugging stage.

However, programmers have long known the importance of deep debugging, running test data against their programs and doing detailed, line-by-line code inspection. Programmers also know how difficult it is to do debugging well. As a result, almost all corporations required deep debugging processes. These reduce errors to about two in every 1,000 lines of code.

Raymond R. Panko and Ralph H. Sprague, Jr., “Hitting the wall: errors in developing and code inspecting a 'simple' spreadsheet model,” Decision Support Systems, vol. 22, no. 4, April, 1998, pp. 337-353.
Raymond R. Panko, panko.cba.hawaii.edu/ssr/.
Saul J. Gass, “Model World: Danger, Beware the User as Modeler,” Interfaces, vol. 20, no. 3, May-June, 1990, pp. 60-64.

Panko and Sprague (1998) and Panko's web page document these problems further and provide bibliographies that include the popular press and practitioner literature. They cite overconfidence as a serious difficulty. Gass (1990) similarly observes,

When the user is also the modeler, the user will likely have confidence in the model. If I were the user-modeler, I certainly would believe in my model and use the solution. A clear danger! Where are the checks and balances?

Then he makes a case for model verification and validation.

Overconfidence is one of those things, like humility, that we never know when we've licked. The best we can do is to try to learn, and keep on learning.

A computer program with a bug in it will often crash, or misbehave in obvious ways. A spreadsheet with an erroneous formula will do no such thing. The error will be completely hidden unless you go looking for it. It will not crash your computer, it will not erase your files, but it may bankrupt you. So spreadsheets tend to inspire false confidence − the errors rarely announce themselves, so the writer rarely gets negative feedback.

What is the probability that your spreadsheet has an error? Panko's data suggests 4% per cell. The probability of zero errors is (1−0.04)(number of cells). A spreadsheet with just 17 cells hits the 50% mark. That is, for a typical user, a spreadsheet with 17 cells has a 50-50 chance of containing at least one error.

But consider the writer with lots of experience. (Overconfidence?) Assume his error rate is only 1%. He needs a spreadsheet of only 69 cells to hit the 50-50 point. And if his spreadsheet contains 2,000 cells (a paltry 40-by-50 table), the probability of zero error is 1 in 500 million.

Here is some easy Visual Basic code that will estimate the probability that the entire spreadsheet is correct, based on the number of formulas in the spreadsheet, and the assumption that about 4% of formulas are wrong. This estimation may be pessimitic on one hand, because Panko's data assumes a typical user, when of course we may have more self-assurance (however unwisely so). The estimation is optimistic on the other hand, because it assumes all inputs are correct.

Sub ChanceRight()
Dim i As Integer
Dim numberOfFormulas As Integer
Dim chance As Double

numberOfFormulas = 0
For i = 1 To Worksheets.Count
    numberOfFormulas = numberOfFormulas + _
    Worksheets(i).Cells.SpecialCells(xlCellTypeFormulas).Count
Next i

chance = Application.Round(100 * (1 - 0.04) ^ numberOfFormulas, 1)
MsgBox ("Based on studies of spreadsheet error, and on the number _
of formulas in this spreadsheet, the chance this spreadsheet is entirely _
correct is about " & chance & "%. _
There are " & numberOfFormulas & " formulas.")
End Sub
Tufte, Edward, The Visual Display of Quantitative Information, 1983, Graphics Press, Cheshire, Conn., p. 56.

Making a spreadsheet correct spreadsheets is hard, but making a spreadsheet readable is hard, too. The writer often has trouble making sense of his own work. Edward Tufte wrote, “Tables usually outperform graphics on small data sets of 20 numbers or less. The special power of graphics comes in the display of large data sets.” This suggests that using a spreadsheet to display a lot of information is a challenge.

Frankly, people do not enjoy reading spreadsheets. Spreadsheets are essentially tables of numbers, and tables of numbers can be murder boring. So should we dress them up, decorate them, add some catchy art, use plenty of color? No, this is not a good idea. The art of the spreadsheet is about decision-making and judgement, not entertainment. The aesthetics should aid judgement rather than entertain.

Good style helps reduce errors and aids readability. This book will show you how and why.

“Writing a spreadsheet is like writing a computer program”

If you go to the library and look up “spreadsheet style” in the various databases, you will find articles mostly in the accounting literature, a few in operations research (my own field), and a very few in computer science. Most of this literature uses metaphors based on programming itself. In effect, people who have written about spreadsheet style have in their heads the metaphor that “Writing a spreadsheet is like writing a computer program.”

Thus, most of these people recommend that a spreadsheet writer should divide the spreadsheet into modules of input, analysis, output, and documentation.

David Freeman, “How to make spreadsheets error-proof,” Journal of Accountancy, May 1996, vol. 181, no. 5, pp. 75-77.

The accounting profession is the most extreme about this modular approach, and they have been the most vocal. Accountants are perhaps the only profession to have as much concern about spreadsheet style as we operations researchers. The accountants show a preoccupation with standards, and they use a carrot-and-stick method for imposing their style on spreadsheet writers. First, they offer the carrot of perfection. For example, David Freeman's article (1996) is titled “How to make spreadsheets error-proof.” Second, they urge that companies impose a “standard” format on all spreadsheet users.

Bromley (1985) stated,

Robert G. Bromley, “Template Design and Review: How to Prevent Spreadsheet Disasters,” Journal of Accountancy, Dec., 1985, vol. 160, No. 6, pp. 134-142.
Dan N. Stone and Robert L. Black, “Using Microcomputers: Building Structured Spreadsheets,” Journal of Accountancy, Oct., 1989, vol. 168, no. 4, pp. 131-142.
Robert Kee, “Programming Standards for Spreadsheet Software,” CMA Magazine, April 1988, vol. 62, no. 3, pp. 55-60.
Robert Kee, and John O. Mason, Jr., “Preventing Errors in Spreadsheets,” Internal Auditor, Feb., 1988, vol. 45, no. 1, pp. 42-47.

Complicated templates will be easier to understand if they are broken down into meaningful sections. The ones described below are adaptations of the structured COBOL divisions recommended by the Conference on Data Systems and Language Standards...

Stone and Black (1989) recommended “techniques for building logically organized spreadsheets...founded on well-established practices used in writing computer programs.” They go on to recommend four sections: an introduction, inputs, calculations, and outputs. The introduction is supposed to have a title, file name, date, author, purpose, assumptions, table of contents, and instructions.

Kee (1988, p. 59) states, “Like mainframe applications, spreadsheet software should be documented both internally and externally.” Regarding documentation: “For large or complex templates, a flowchart of the template's logic may be needed to verify that all steps in data processing are incorporated into the template's logic.”

Kee and Mason (1988) wrote

A large proportion of spreadsheet applications is used for relatively minor and user-specific tasks that expose the firm to little risk, so these applications need only minimal controls. Conversely, spreadsheet programs that influence crucial decisions, impact multiple areas, or are used to prepare financial and other key reports need controls comparable to those of mainframe applications...

For large and/or complex templates, it is often easier to develop a new spreadsheet than to attempt to revise a poorly structured or documented model. Standardization of a firm's spreadsheet design may aid in resolving many of these problems. Standardization imposes a common structure on spreadsheet design, thereby forcing a greater degree of planning and documentation on template developers.

Then they recommend six sections to a spreadsheet: description, data entry, data validation, formula/output, and documentation.

John L. Crain and William C. Fleenor, “Standardizing Spreadsheet Designs,” CPA Journal, Oct., 1989, vol. 59, no. 10, pp. 81-84.
D. Mather, “A framework for building spreadsheet based decision models,” Journal of the Operational Research Society, 1999, vol. 50, pp. 70-74.

Crain and Fleenor (1989) wrote

The standard setting process should be a joint effort by all spreadsheet users. Drafts of proposed policies and procedures should be circulated for review by the people affected, with their ideas integrated where possible.

Alas, even we operations researchers have fallen prey to the idea of spreadsheet-as-program. Mather wrote,

As in all aspects of programming, the way one chooses to develop a spreadsheet model is very subjective.

So he proposed that the spreadsheet writer first prepare a flow chart of the information.

Jeffrey L. Bissell, “Spreadsheet Planning and Design,” Journal of Accountancy, May 1986, vol. 161, no. 5, pp. 110-120.
W. R. Edge and E.J.G. Wilson, “Avoiding the Hazards of Microcomputer Spreadsheets,” Internal Auditor, April 1990, vol. 47, no. 2, pp. 35-39.

Some articles (Crain and Fleenor 1989, Bissell 1986, Edge and Wilson 1990) contributed little more than recommendations that every spreadsheet have various mixes of the following:

With all this in a spreadsheet, it is big before we start! We already know that a big spreadsheet is more likely to have errors than a small one. About the only thing that I agree with is, “no constants in formulas.”

David Freeman, “How to make spreadsheets error-proof,” Journal of Accountancy, May 1996, vol. 181, no. 5, pp. 75-77.
W. R. Edge and E.J.G. Wilson, “Avoiding the Hazards of Microcomputer Spreadsheets,” Internal Auditor, April 1990, vol. 47, no. 2, pp. 35-39.

The accountants have some good ideas for the details, but their ideas for overall structure make a spreadsheet hard to navigate and hard to read, and are more likely to produce errors rather than eliminate them. Perhaps the worst advice from literature is for the now-outdated stair step structure (Freeman 1996, Edge and Wilson 1990). At least I hope and pray that it is outdated. The stair step structure is a block diagonal matrix with blanks off the diagonal.

The accountants recommend a stair step structure under “basic design” for input, calculation, and output, a block for each section, to avoid problems if a row or column were deleted. But this idolizes modularity at the expense of readability. First-generation spreadsheets had only one sheet per file, but with more recent software, these “steps” at least can be put on different sheets in the same file. This causes other problems. (More on this later.)

What have the operations researchers and management scientists said about spreadsheet design? Surprisingly little, given the volume of work written with spreadsheets. Little of the operations research literature recommends any particular structure, even that of data, analysis, and output.

A common type of operations research model is the linear program. Roughly speaking, a linear program is a set of simultaneous linear equations, like the two-equations-in-two-variables that we all learned in grade school. The linear program is an archetype of decision-making - the real world problem is reflected in its structure. Its structure is made up of decisions (the variables), constraints, and an objective. Usually special solver software is needed to solve a linear program.

A spreadsheet is a marvelous way to write and solve a small or medium-sized linear program. Sometimes people write a spreadsheet that corresponds to a linear program without being aware of it. Linear programs in spreadsheets defy conventional wisdom for spreadsheet style.

A decision variable is a constant in a spreadsheet, but it is calculated by an external solver. Is this data that belongs in a data block? Not really. The spreadsheet with a linear program is no longer a computer program, but a set of equations to describe an optimization problem. An operations researcher may intuitively feel that the metaphor of spreadsheet-as-computer-program becomes uncomfortable.

John S. Edwards and Paul N. Finlay, Decision Making with Computers: the Spreadsheet and Beyond, Pitman Publishing, London, 1997.

Edwards and Finlay (1997) wrote a book on modeling with spreadsheets. Their good advice generically applies to almost any kind of modeling, such as “The more important an issue, the greater the potential value of the spreadsheet.” (p. 46). They advise the writer to check that factors have consistent dimensions and units of measure, and to check the range of application of relationships (pp. 114-116). While Edwards and Finlay do a good job discussing the capabilities of spreadsheets, they have little fresh to offer for spreadsheet style.

B. Ronen, M.A. Palley, and H.C Lucas, Jr., “Spreadsheet Analysis and Design,” Communications of the ACM, vol. 32, no. 1, pp. 84-93.

Ronen, Palley, and Lucas (1989), actually computer scientists rather than operations researchers, gave three design objectives, and made a case for readability.

A spreadsheet should produce reliable results; the output it generates should be correct and consistent.

A spreadsheet should be capable of being audited; the user should be able to retrace the steps followed to generate different outputs from the model in order to understand the model and to verify findings.

A spreadsheet should be capable of being modified easily without introducing errors.

A final issue impacts the three listed above: comprehensibility. The designer and user should be able to easily understand the model and its assumptions as represented in the spreadsheet.

D. Mather, “A framework for building spreadsheet based decision models,” Journal of the Operational Research Society, 1999, vol. 50, pp. 70-74.

But the best they could offer was a “structured approach to spreadsheet design,” with spreadsheet-specific model flow diagram symbols, just like computer programming. They further recommended a particular format with owner, developer, user, date, file name, a map of the model, parameters, input, formulas and output. Mather (1999) also proposed a model flow diagram, then a structure of Manager Page, Constants Page, Calculation Page, and Graphs Page.

Paul B. Cragg and Malcolm King, “Spreadsheet modelling abuse: An opportunity for OR?” Journal of the Operational Research Society, Aug., 1993, vol. 44, no. 8, pp. 743-752.

Cragg and King (1993) studied twenty specific spreadsheets in ten firms. They saw the importance of layout.

A key feature of sound development and use of a spreadsheet system is good layout. In particular, careful separation of sections of the model is important so that it is clear to others what the spreadsheet system is doing...

Although the majority of the spreadsheet systems analyzed were found to have some separation of sections (e.g. input and output) only 55% had a clear separation. The others had sections of input and output scattered throughout the model.

No model separated the calculations from the rest of the model. The calculations tended to be mixed with the output.

Many of the systems (60%) had a separate section from which reports were printed. These were often very clear, though only five used facilities such as bold, to enhance their output.

The lack of separation in the remainder of the systems made them very difficult to understand.

They briefly gave a few other ideas, using cell protection, version identification, absolute references, range names, and validation of input data.

To summarize, the current style specifies a modular “standard format” with the modules generally being data, formulas, and output. The promise is, “If you use modular spreadsheets, you will avoid errors.” The requirement is that “People must be forced to used standard formats.” The mindset is “Writing a spreadsheet is like writing a computer program.”

Managers do not want the current style

Paul B. Cragg and Malcolm King, “Spreadsheet modelling abuse: An opportunity for OR?” Journal of the Operational Research Society, Aug., 1993, vol. 44, no. 8, pp. 743-752.

Cragg and King found that standards are unlikely to work. They found that 60% of the models were used solely by their creator, managers used spreadsheets for the freedom from internal Information Technology groups, models were made and refined iteratively, only one-third of the model builders had training in programming (with no difference in model quality with or without training in programming), and “design” took place with only eight of the twenty models (and that “rudimentary”). Furthermore, there was “an average model creation time of three days,” so “it was considered a waste of time spending one or two days on documentation.”

In light of all this, they state

Managers like spreadsheets because they reduce their dependence on others ... and allow them to do their own thing, without having to be bothered with much planning. Thus, anyone offering help or support could be viewed with suspicion. Attempts to encourage standard methods or to enforce specific procedures are likely to be greeted with hostility.

Likewise, Ronen, Palley, and Lucas (1989) stated,

B. Ronen, M.A. Palley, and H.C Lucas, Jr., “Spreadsheet Analysis and Design,” Communications of the ACM, vol. 32, no. 1, pp. 84-93.
Steve J. Davis, “Tools for spreadsheet auditing,” International Journal of Human Computer Studies, Oct., 1996, vol. 45, p. 429-42.

In general, these users have not been trained in systems analysis and tend to overlook concerns of the professional systems analyst in designing a system, such as reliability, auditability, and control. In fact, the spreadsheet user is often happy to avoid systems professionals.

Davis (1996) wrote

Many people turn to spreadsheets to avoid analysis and design formalities so as to produce an application as quickly as possible. It is unlikely that many spreadsheet users would follow any design procedure that would slow them down.

Instead of specifying a standard format, Davis developed spreadsheet auditing tools.

So spreadsheet users do not use and do not want the modular style, and they do not want imposed standards.

D. G. Conway and C. T. Ragsdale, “Modelling optimization problems in the unstructured world of spreadsheets,” OMEGA, 1997, vol. 25, no. 3, pp. 313-322.

A careful literature search produced a single paper with fresh ideas on writing spreadsheets, something other than “have modules for input, analysis, and output, and avoid constants in formulas”. Conway and Ragsdale (1997) first trash the idea of a standard format as an attempt to “apply principles from traditional information systems (IS) design standards to the spreadsheet environment”, particularly when applied to optimization problems. They point out the inappropriateness of sacrificing communicative effectiveness for the sake of conformance to a “standard format,” with first a nod to its motivation.

The proliferation of spreadsheet software represents somewhat of a nightmare to corporate IS professionals as many non-IS managers have adopted a “Do-It-Yourself” approach to their own systems and modeling needs...Thus it is not surprising that some IS professionals have called for standards to guide spreadsheet development.

Robert Kee, “Programming Standards for Spreadsheet Software,” CMA Magazine, April 1988, vol. 62, no. 3, pp. 55-60.

They cite Kee (1988).

'Reliable spreadsheet software begins with a standard format for developing spreadsheet applications...' However, contrary to Kee's sentiments (given above), for many optimization problems we find that the forced use of a standard format results in spreadsheets models [sic] that are more difficult to construct, less reliable, and more difficult to understand.

Prescriptions for spreadsheets are like prescriptions for printed documents. A printed document has a specific format for a specific purpose. A tax form must have a specific format for all users; this makes sense. But we would not think therefore that all printed documents must follow a standard format like a tax form. The family budget should not have the same format as the corporate budget.

So we have seen the prescription given by the academics and accountants. Further, we have seen that people do not want to use that prescription, and really do not want it imposed on them. The amazing thing is that there is no evidence that the input-formulas-output style is effective, or reduces the number of errors. Indeed, we shall see that there is good reason to believe that the prescription that has been preached for so long may itself be the cause of a good many problems.

What style should a spreadsheet have?

Almost every other form of media has publicly recognized examples, even prizes, for good style. Literature has the Nobel Prize, film has the Academy Awards, mathematical modeling has the Franz-Edelman Prize, and so forth. Nothing like this exists for spreadsheets. There does not exist a publicly recognized outstanding set of spreadsheets that can serve as great examples. Tufte put it this way:

Tufte, Edward, Envisioning Information, 1990, Graphics Press, Cheshire, Conn., p. 9.

Despite the beauty and utility of the best work, design of information has engaged little critical or aesthetic notice: there is no Museum of Cognitive Art.

All we spreadsheet writers have are these miserable accounting articles that tell us to use a standard format and not to put constants in formulas.

William Strunk and E.B. White, The Elements of Style, third edition, 1979, MacMillan Publishing Co., Inc., New York.
Nicholas J. Higham, Handbook of Writing for the Mathematical Sciences, 1993, SIAM.
George B. Dantzig, Linear Programming and Extensions, Princeton University Press, 1963.
Linus Schrage, Optimization Modeling with LINGO, Lindo Systems, Inc., 1998.

So what is left for spreadsheet style? Well, what is done in other arts? Writing, graphics, and mathematics have histories of thousands of years, the spreadsheet no more than fifteen or twenty. Without a Shakespeare, a Michelangelo, or an Euler to guide us, to develop a style for a new medium we must look to what we already know. Instead of standard formats, writers rely on examples of well-done work and manuals of style. The latter include Strunk and White for text, Tufte for graphics, and Higham for mathematics. We can add Dantzig, Schrage, or any other great operations researcher for modeling.

In planning a spreadsheet, a flow chart may help. I would hasten to say I would not discourage their use. Never have I used one, though. And with object-oriented programming, computer programmers are not using flow charts either.

So should we not formally plan a spreadsheet? Again, I am in favor of virtue. However, a spreadsheet should be revised and edited for clarity, like any document. An initial plan is not as important as a polished product. We have seen that managers will not follow corporate standards. So instead of standards and formal plans, this book gives common sense and aesthetics.

Conway and Ragsdale (1997), after slamming the idea of a standard format, wrote,

D. G. Conway and C. T. Ragsdale, “Modelling optimization problems in the unstructured world of spreadsheets,” OMEGA, 1997, vol. 25, no. 3, pp. 313-322.

In most cases, we believe the spreadsheet design which communicates its purpose most clearly will also be the most reliable, auditable and modifiable design.

Their new proposals are (1) that logically related formulas should be in physical proximity, and (2) that the way we read should be a factor in how we write spreadsheets. All previous works make recommendations that prevent such structure, or ignores these aspects of layout altogether. Conway and Ragsdale were the first to write of modern spreadsheet style.

Spreadsheets should be written for readability

Computer code is not written primarily for display, but a spreadsheet is meant to be seen and read. It is true that a spreadsheet has the three abstract parts, data, calculation, and output. However, this functional structure was originally designed for the computer's needs of batch-processed punched cards of the 1950's, '60's and '70's, rather than the reader's needs.

The input-calculation-output structure implies the calculations are a black box that the reader should ignore rather than read. The writer wants to be trusted that the formulas are right, rather than take the time and effort to make the formulas clear. The evidence tells us that we should not trust the writer.

Paul B. Cragg and Malcolm King, “Spreadsheet modelling abuse: An opportunity for OR?” Journal of the Operational Research Society, Aug., 1993, vol. 44, no. 8, pp. 743-752.

The spreadsheet-as-computer-program metaphor is further likely to be inadequate, because most spreadsheet writers do not have a background in data processing (Cragg and King 1993). This metaphor is one that writer and reader are unlikely to expect, understand, or want. Using Sheet 1 for data and Sheet 2 for calculations sounds good but is a bad idea. A reader will not want to read a computer program, and spreadsheet writers do not know computer programming conventions anyway.

Ironically, programmers no longer use the input-calculation-output style for programming. In object-oriented code, each module is related to a real-world object.

A spreadsheet cannot have an “output” section in the programming sense. Output must be a formula.

Instead of “data and calculation” or even “decision variable and constraint,” the reader will be thinking in terms of the business problem, such as “person, shift assignment, preference.” This suggests that a spreadsheet should be organized to follow the business logic. Furthermore, the spreadsheet should be written with readability as the primary goal. To do this, we need a new metaphor to replace “The spreadsheet is a computer program.”

If we think of a spreadsheet as a mix of text, graphics, and mathematics, and if we think of writing a spreadsheet as writing a technical document with those elements, then we have well-developed rules of style already available.

Throughout, this book contrasts the spreadsheet-as-program style with the new spreadsheet-as-writing style. Some readers may find the new metaphor controversial or rigid. In some ways, the new style is more demanding than the older style, but it is a style, not a fixed format. Feel free to make exceptions. As one follows style rules of text with care, breaking those rules where appropriate, rules of spreadsheet style should not be applied without thought. Mainly, this book is intended to sensitize you to principles of spreadsheet readability.

In the following, numeric cell means a formula or a constant referenced by formulas in other cells. A constant unreferenced label will be treated as nonnumeric, even if it is a number such as a year “2003.”

The new spreadsheet style

The key to writing spreadsheets is: Write a spreadsheet as text, mathematics and graphics are written. From this come general rules, and the general outline for this book.

Design consistent direction for the reader to follow.

Reduce distance between related pieces of information.

Build context by concentrating related data.

Amplify the most important signal that we want to send.

Erase noise that might obscure the signal.

Given alternatives for writing a particular spreadsheet, choose the one that makes the spreadsheet compact, scalable, and predictable. Fortunately, these objectives tend to coincide.

How do we implement these general rules? By following these steps:

  1. Make your spreadsheets read from left to right and top to bottom.
  2. Be concise with sheets.
  3. Be concise with blocks.
  4. Be concise with blank space.
  5. Be concise with cells.
  6. Format for description, not decoration.
  7. Show all the information.

Later, we will discuss spreadsheet errors and show how to audit a spreadsheet.

In effect, the new paradigm for spreadsheets says that a spreadsheet is a technical document, and should be written as a document. The analogy to technical writing does not always work, because a spreadsheet is a synergistic combination of the old arts, and there is indeed computer programming involved. Still, the analogy to these arts might be more appealing to people with arts degrees, who are not in a decision support profession, but still have to write spreadsheets. But I would like to say something a bit stronger. My proposal today is that we may be able to fix some spreadsheet problems by using these older rules of style, and that indeed, some common errors are caused by breaking the rules for text, graphics, and mathematics.