Matt Wedel is constantly telling me I need to read Frank Herbert’s classic sci-fi epic Dune. I’ve never been keen because of the vast number of sequels, but I finally gave in to his repeated requests and started on it last night, on my Kindle.
I got as far as page 4. Since the Kindle shows small pages, I guess that’s part way down page 2 of a printed copy. Here’s why:
Yes, Paul. What is a gom jabbar?
Let’s start by thinking about a very simple example. I’ve recently switched to using Ruby as my language of choice, after a decade as a Perl hacker. Ruby does a lot of things more nicely than Perl, including having proper object syntax, simple everything-is-an-object semantics, a sweet module/mixin scheme and very easy-to-use closures, so I’ve mostly been very happy with the switch. In so far as Ruby is a better Perl, I don’t see myself ever writing a new program in Perl again, except where commercial considerations demand it.
Today, I want to talk about a problem that I don’t have a good solution for, and throw the floor open in the hope that someone else does. Teach me, O commenters.
Suppose your program has to build a tree structure — as the result of parsing a query, say. And suppose that, having built the tree, you want to do several different operations that involve walking that tree. How do you design that program?
To make it concrete, consider my (rather aged) package CQL-Java, which — get ready for a big surprise — provides a CQL parser in Java. CQL is a conceptually simple but very precise and expressive query language used in information retrieval, but for our current purposes all we need to know is that it supports structured boolean queries like these:
kernighan and ritchie
(kernighan and ritchie) or fowler
kernighan and (ritchie or pike)
Our task is to parse such queries, and to be able to render them out either in an XML format called XCQL or in Index Data’s ugly but functional Prefix Query Format.
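To make the shape of the problem concrete, here is a minimal sketch of the most obvious design: each node in the tree carries one method per output format. Everything in it is an assumption made for illustration. The names (CqlTreeSketch, Node, Term, BoolNode, toPqf, toXcql) are hypothetical and are not the real CQL-Java API, the tree is built by hand rather than parsed, the XML is only loosely modelled on XCQL, and the PQF output leaves out the attributes a real server would want. It is one possible answer to the design question, not necessarily a good one.

```java
// Hypothetical sketch: names and output formats are illustrative
// assumptions, not the actual CQL-Java API or full XCQL/PQF.
class CqlTreeSketch {

    // Every node knows how to render itself in each output format.
    interface Node {
        String toPqf();   // simplified Prefix Query Format
        String toXcql();  // simplified, XCQL-like XML (no escaping done)
    }

    // A leaf: a single search term such as "kernighan".
    static final class Term implements Node {
        private final String word;

        Term(String word) { this.word = word; }

        public String toPqf() { return word; }

        public String toXcql() { return "<term>" + word + "</term>"; }
    }

    // An interior node: two subtrees joined by "and" or "or".
    static final class BoolNode implements Node {
        private final String op;
        private final Node left;
        private final Node right;

        BoolNode(String op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }

        // Prefix notation: operator first, then both operands.
        public String toPqf() {
            return "@" + op + " " + left.toPqf() + " " + right.toPqf();
        }

        public String toXcql() {
            return "<triple><boolean>" + op + "</boolean>"
                 + "<leftOperand>" + left.toXcql() + "</leftOperand>"
                 + "<rightOperand>" + right.toXcql() + "</rightOperand>"
                 + "</triple>";
        }
    }

    // Hand-built tree for: kernighan and (ritchie or pike)
    static Node example() {
        return new BoolNode("and",
                new Term("kernighan"),
                new BoolNode("or", new Term("ritchie"), new Term("pike")));
    }

    public static void main(String[] args) {
        Node query = example();
        System.out.println(query.toPqf());
        System.out.println(query.toXcql());
    }
}
```

The catch with this layout is that every new operation on the tree means adding another method to every node class, which is exactly the tension the question above is getting at.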
I’m delighted to announce that issue 10 of PragPub is out today, and contains an article by me, Tangled Up in Tools. For those who like nice typesetting, you can download the whole issue as a PDF, or as epub or mobi (whatever they may be).
PragPub is the monthly programming magazine published by the Pragmatic Bookshelf — that’s the publisher that brought you the definitive books on Ruby and on Rails, among much else. I like their stuff, and I am pleased to be in their magazine. Not only that, they let me illustrate the article with photos of — oh, but you’ll see for yourself when you read it.
I’ve just read Clay Shirky’s new article, The Collapse of Complex Business Models, which is in turn based on Joseph Tainter’s 1990 book The Collapse of Complex Societies [amazon.com, amazon.co.uk]. Shirky summarises Tainter’s analysis of why ancient cultures such as the Romans and Maya collapsed so catastrophically:
A group of people, through a combination of social organization and environmental luck, finds itself with a surplus of resources. Managing this surplus makes society more complex — agriculture rewards mathematical skill, granaries require new forms of construction, and so on. Early on, the marginal value of this complexity is positive — each additional bit of complexity more than pays for itself in improved output — but over time, the law of diminishing returns reduces the marginal value, until it disappears completely. At this point, any additional complexity is pure cost.
Tainter’s thesis is that when society’s elite members add one layer of bureaucracy or demand one tribute too many, they end up extracting all the value from their environment. [...] Complex societies collapse because, when some stress comes, those societies have become too inflexible to respond. [...] In such systems, there is no way to make things a little bit simpler — the whole edifice becomes a huge, interlocking system not readily amenable to change.
When the value of complexity turns negative, a society plagued by an inability to react remains as complex as ever, right up to the moment where it becomes suddenly and dramatically simpler, which is to say right up to the moment of collapse. Collapse is simply the last remaining method of simplification.
Stop me if this seems too obvious to be worth saying, but isn’t this exactly what happens to big programs? “When society’s elite members add one layer of bureaucracy or demand one tribute too many” sounds disturbingly like “When the framework introduces a notion of a connection factory manager builder”.
This is an immediate followup, about twelve hours after I posted the original What is “simplicity” in programming?, because the excellent comments on that post have pointed me to another insight. In particular, Chris pointed out that “Ideally, you would not have to read the code of all the methods as the name should tell you by himself what is it doing.”
I think Chris has put his finger on an area where individual temperament, preference and aptitude are very important. Probably, skipping over the small methods is the right thing for at least some programs — the ones that have been extensively Fowlered. But it doesn’t come naturally to me at all. It makes me nervous. I actually feel physically uncomfortable about working with code that I’ve not read. So maybe that’s why I am happier with one larger method than several smaller ones sprinkled across various classes and files. (To be clear, I am not advocating big functions and classes — no-one wants to read 300 lines in a single method, or classes with 50 methods. I’m in favour of biggER functions and classes than Martin Fowler is, that’s all.)
We all agree that we want our programs to be simple: in the memorable words attributed to Einstein, “as simple as possible, but no simpler”. Kernighan and Pike’s most recent book, 1999’s The Practice of Programming [amazon.com, amazon.co.uk], has three key words written on a whiteboard on its cover: Simplicity, Clarity, Generality; and the greatest of these is Simplicity.
But what is “simple”?