I said in the first part of this series that one of the books I wanted to talk about was written in 1974. My colleague Dennis Schafroth guessed that it might be Kernighan and Ritchie’s classic, The C Programming Language, but the first edition of that book actually came out in 1978. I awarded half a point anyway, because the book I had in mind (A) was co-written by Brian W. “Water buffalo” Kernighan, and (B) had a second edition in 1978, the same year as K&R. It is Kernighan and Plauger’s The Elements of Programming Style (amazon.com, amazon.co.uk).
(This isn’t a cover image from Amazon, it’s a scan of my personal copy, because I wanted you to see how well-thumbed it is.)
What can a book from 1974 possibly have to teach us 36 years later? Especially when all its example code is in FORTRAN(!) and PL/1(!!)? A lot, as it turns out. For one thing, it contains (on page 10) perhaps the single wisest thing that has ever been said about programming:
“Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?”
It’s a short book: at only 168 smallish pages, it’s less than a fifth as long as the bloated 938-page tome I have about XSLT, and about one third as long as Fowler et al.’s Refactoring, which is perhaps its spiritual heir. It’s arranged as eight chapters: Introduction, Expressions, Control Structure, Program Structure, Input and Output, Common Blunders, Efficiency and Instrumentation, and Documentation; each chapter demonstrates ten or so rules (which I will list below).
So far, so didactic: anyone can guess from this that EoPS is a useful book; what’s not so obvious is that it’s very funny. I’ll let K&P explain their approach (from the preface to the First Edition on page xi):
This book is a study of a large number of “real” programs, each of which provides one or more lessons in style. We discuss the shortcomings of each example, rewrite it in a better way, then draw a general rule from the specific case. [...] all of the programs we use are taken from programming textbooks. Thus we do not set up artificial problems to illustrate our points — we use finished products, written and published by experienced programmers.
This in itself is pretty hilarious: EoPS consists entirely of the mistakes made by people confident enough to publish their own programs as examples of good style. What lifts it to the realm of laugh-out-loudfulness is the very dry style: K&P graciously abstain from going to town on the deficiencies of the programs they study, but their minutely detailed dissections speak volumes, and seem (unless I am imagining it) to convey an undertone of profound disdain. For example, consider these observations on a program to calculate the area under a curve:
“With all the extraneous assignments removed, it is easier to see the underlying structure. It is also easy to see that the indentations reflect little of what is going on. But what is the purpose of the variable I? It is laboriously kept equal to J so that OUT can be called at the end of the last iteration. Clearly I is not needed, for J could be used for the test. But the test is not needed; OUT could be called just after the inner DO loop has terminated. But OUT need not be called at all, for its code could just as well appear in the one place it is invoked. The structure simplifies remarkably.”
As another example, check out this section on commenting:
In other sections, fairly significant programs like a maze solver are taken apart, subjected to dispassionate but merciless criticism, and put together again shorter, clearer, more correct and more functional than before. In short, Kernighan and Plauger don’t just explain how to program well, they show us how it’s done.
I’m not going to claim that the book hasn’t aged. As you can see from the extract above, the typography looks very primitive (it was done on an early version of troff), and rules such as “Avoid the Fortran arithmetic IF” and “Initialize constants with DATA statements or INITIAL attributes; initialize variables with executable code” just don’t apply any more in the post-FORTRAN era. Another rule, “Write first in an easy-to-understand pseudo-language; then translate into whatever language you have to use”, is also not applicable now that languages like Python and Ruby can read and execute the equivalent of the old pseudo-code. This in itself demonstrates the value of one of K&P’s more enduring rules: “Let the machine do the dirty work”. Yes indeed: translating pseudo-code into executable code should not be left to humans.
But most of the rules are timeless, and remain as true and important today in 2010 as they were in 1974. The first rule after the introduction remains one of my favourites: “Say what you mean, simply and directly”. (You may not believe it, reading this blog, but I try to apply this to my prose writing as well as my programming.) Others that we should all try to live by: “Each module should do one thing well”; “Let the data structure the program”; “Make it right before you make it faster” and “Keep it right when you make it faster”.
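To make the clever-versus-direct contrast concrete, here is a small Python sketch modelled on the integer-division trick for building an identity matrix, the sort of thing the book's opening chapter takes apart. It is not K&P's FORTRAN; the function names and the Python rendering are my own assumptions.

```python
def identity_clever(n):
    """Too clever: relies on integer division truncating to 0 unless i == j."""
    # Indices run from 1 so that i // j and j // i never divide by zero.
    return [[(i // j) * (j // i) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def identity_direct(n):
    """Says what it means: fill with zeros, then put ones on the diagonal."""
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 1
    return matrix


if __name__ == "__main__":
    # Both produce the same matrix; only one of them explains itself.
    print(identity_clever(3))
    print(identity_direct(3))
```

Both functions return the same result, but a reader has to work out why the first one is correct, whereas the second can be checked at a glance.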
So I keep coming back to EoPS (I am re-reading it as I write this) because it’s short, it’s easy reading, it’s funny, and much of its advice is timeless. In a way, you could say its age is even a plus-point, because it makes it obvious which of the rules are of their time and which are fundamental — whereas, for example, everyone knows that Design Patterns contains a mix of genuine insight and mere patches for Java’s lack of expressive power, but it’s not yet clear which patterns fall into which categories. Give it another twenty years, and we should be in a position to figure that out.
Appendix: “summary of rules” from EoPS
Abstracted from the appendix SUMMARY OF RULES in The Elements of Programming Style (Second Edition) by Brian W. Kernighan and P. J. Plauger, pub. McGraw-Hill, ISBN 0-07-034207-5.
This summary is designed to give a quick review of the points we covered in the book. Remember as you read the rules that they were presented in connection with one or more examples — go back and reread the pertinent section if a rule doesn’t call them to mind.
To paraphrase an observation in The Elements of Style, rules of programming style, like those of English, are sometimes broken, even by the best writers. When a rule is broken, however, you will usually find in the program some compensating merit, attained at the cost of the violation. Unless you’re certain of doing as well, you will probably do best to follow the rules.
Introduction
- Write clearly — don’t be too clever.
Expressions
- Say what you mean, simply and directly.
- Use library functions.
- Avoid temporary variables.
- Write clearly — don’t sacrifice clarity for “efficiency”.
- Let the machine do the dirty work.
- Replace repetitive expressions by calls to a common function.
- Parenthesize to avoid ambiguity.
- Choose variable names that won’t be confused.
- Avoid the Fortran arithmetic IF.
- Avoid unnecessary branches.
- Use the good features of a language; avoid the bad ones.
- Don’t use conditional branches as a substitute for a logical expression.
- Use the “telephone test” for readability.
Control Structure
- Use DO-END and indenting to delimit groups of statements.
- Use IF-ELSE to emphasize that only one of two actions is to be performed.
- Use DO and DO-WHILE to emphasize the presence of loops.
- Make your programs read from top to bottom.
- Use IF … ELSE IF … ELSE IF … ELSE … to implement multi-way branches.
- Use the fundamental control flow structures.
- Write first in an easy-to-understand pseudo-language; then translate into whatever language you have to use.
- Avoid THEN-IF and null ELSE.
- Avoid ELSE GOTO and ELSE RETURN.
- Follow each decision as closely as possible with its associated action.
- Use data arrays to avoid repetitive control sequences.
- Choose a data representation that makes your program simple.
- Don’t stop with your first draft.
Program Structure
- Modularize. Use subroutines.
- Make the coupling between modules visible.
- Each module should do one thing well.
- Make sure every module hides something.
- Let the data structure the program.
- Don’t patch bad code — rewrite it.
- Write and test a big program in small pieces.
- Use recursive procedures for recursively-defined data structures.
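As an illustration of "Use data arrays to avoid repetitive control sequences", here is a minimal Python sketch. The grading example, the cutoffs and the function names are hypothetical, chosen by me only to show the shape of the idea; none of them come from the book.

```python
# Hypothetical grade boundaries, highest first; purely illustrative.
GRADE_TABLE = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def grade_branching(score):
    """Repetitive control sequence: one branch per boundary."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 60:
        return "D"
    else:
        return "F"


def grade_table(score):
    """Same logic, with the boundaries held in data instead of code."""
    for cutoff, letter in GRADE_TABLE:
        if score >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    for score in (95, 80, 61, 12):
        print(score, grade_branching(score), grade_table(score))
```

Changing a boundary or adding a grade now means editing one table entry rather than another branch.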
Input and Output
- Test input for validity and plausibility.
- Make sure input cannot violate the limits of your program.
- Terminate input by end-of-file or marker, not by count.
- Identify bad input; recover if possible.
- Treat end of file conditions in a uniform manner.
- Make input easy to prepare and output self-explanatory.
- Use uniform input formats.
- Make input easy to proofread.
- Use free-form input when possible.
- Use self-identifying input. Allow defaults. Echo both on output.
- Localize input and output in subroutines.
Common Blunders
- Make sure all variables are initialized before use.
- Don’t stop at one bug.
- Use debugging compilers.
- Initialize constants with DATA statements or INITIAL attributes; initialize variables with executable code.
- Watch out for off-by-one errors.
- Take care to branch the right way on equality.
- Avoid multiple exits from loops.
- Make sure your code “does nothing” gracefully.
- Test programs at their boundary values.
- Program defensively.
- 10.0 times 0.1 is hardly ever 1.0.
- Don’t compare floating point numbers just for equality.
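The two floating-point rules above are easy to see in action. This is a minimal Python sketch, assuming IEEE 754 double precision; the helper name add_tenths is my own. On such machines the literal product 10.0 * 0.1 happens to round to exactly 1.0, but adding 0.1 ten times does not, which is the same trap in modern clothing.

```python
import math


def add_tenths(count):
    """Add 0.1 to a running total `count` times, one step at a time."""
    total = 0.0
    for _ in range(count):
        total += 0.1
    return total


if __name__ == "__main__":
    total = add_tenths(10)
    print(total == 1.0)              # exact comparison fails
    print(math.isclose(total, 1.0))  # comparison within a tolerance succeeds
    print(0.1 + 0.2 == 0.3)          # another exact comparison that fails
```

math.isclose compares within a relative tolerance (1e-09 by default), which is usually what was meant in the first place.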
Efficiency and Instrumentation
- Make it right before you make it faster.
- Keep it right when you make it faster.
- Make it clear before you make it faster.
- Don’t sacrifice clarity for small gains in “efficiency”.
- Let your compiler do the simple optimizations.
- Don’t strain to re-use code; reorganize instead.
- Make sure special cases are truly special.
- Keep it simple to make it faster.
- Don’t diddle code to make it faster — find a better algorithm.
- Instrument your programs. Measure before making “efficiency” changes.
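"Measure before making efficiency changes" is cheap to follow these days. The sketch below uses Python's standard timeit module; the two functions being compared and the measure helper are hypothetical examples of mine, not anything from the book.

```python
import timeit


def squares_loop(n):
    """Build a list of squares with an explicit loop."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


def measure(func, n=1000, repeats=200):
    """Return the total seconds taken by `repeats` calls of func(n)."""
    return timeit.timeit(lambda: func(n), number=repeats)


if __name__ == "__main__":
    # Make it right first: both versions must agree before timing matters.
    assert squares_loop(1000) == squares_comprehension(1000)
    for func in (squares_loop, squares_comprehension):
        print(f"{func.__name__}: {measure(func):.4f} s")
```

The timings will vary from machine to machine, which is rather the point: the numbers decide, not the hunch.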
Documentation
- Make sure comments and code agree.
- Don’t just echo the code with comments — make every comment count.
- Don’t comment bad code — rewrite it.
- Use variable names that mean something.
- Use statement labels that mean something.
- Format a program to help the reader understand it.
- Indent to show the logical structure of your program.
- Document your data layouts.
- Don’t over-comment.