Frameworks and leaky abstractions

A bit of personal history, as background, before we jump in …

In late 2009, I got it into my head that there was a web-site I wanted to build.  That web-site would be backed by a database, and would allow records in the database to be created, edited, listed, viewed, deleted and linked together in various ways.  (I won’t talk about the details now, because I want to save them for when I can link to the running site, which doesn’t exist yet.)

I’ve built a lot of conceptually similar sites in my time, mostly using Perl with HTML::Mason to embed mod_perl code fragments in HTML templates (kind of like the way PHP works).  This has entailed having some kind of layer between the application and the database to enable objects to be stored — what we now call an ORM layer, although I didn’t know that term back when I first wrote one in (probably) 2001.

And my experience with these layers is one that I’m sure will be familiar to a lot of you: I’ve built them from scratch too many times, or copied and modified an earlier version, because what I’d done for projects A, B and C was somehow not quite right for project D.  None of my ORM layers have been very good: they’ve all needed too much configuration, and configuration that was too ugly.  So for the late-2009 project, I finally decided to do what I should have done long ago, and that is find a third-party library that I could use: a proper ORM rather than one of the hacky just-the-bits-I-need-right-now half-an-ORMs that I’d been using.

After a bit of poking around, I concluded that The Right Answer for Perl was the DBIx::Class package, which has been around since 2005, is getting frequent releases from active developers, and has enough of a community around it to have a mailing-list, an IRC channel and a public bug-tracking system.  It also has good reviews.

Sadly, as it seemed at the time (though I now look back on this with gratitude), I couldn’t get it to work for me: it’s long enough ago now that I don’t remember the details, but I do recall that the error-messages were wholly unhelpful.  No doubt I could have got it working if I’d tried harder, but happily I took a different path: I stopped to think about how I was building this site, and asked myself whether the time had come to seriously investigate one of these frameworks that non-Perl people are always talking about.  That would mean learning a new language — at first, I thought maybe Python, so I could use Zope, but I landed up choosing Ruby instead, so that I could use Rails.  As it turned out, deciding to learn Ruby was what led to my reinvigoration as a programmer, and so to this blog.  But I’ll talk about that another time.  Once I’d learned enough Ruby to feel competent in it, it was time to buy and read the Rails book, Agile Web Development with Rails, Third Edition [amazon.com, amazon.co.uk].

(Yes, there are other Rails books; but by common consent, this seems to be the canonical one.)

What the Rails book is like

First off, I should say that it looks nice: as you can see above, the Pragmatic Bookshelf people care about design, so that all their books look better than, say, The Elements of Programming Style.  You can see that love of design in their free online magazine, PragPub, too.  I know it’s not a major factor in the quality of a technical book, but good first impressions always help.

I got the Rails book and Fowler’s Refactoring at about the same time, and have been reading them in parallel.  Refactoring looks like a heavier, more academic read, and also has the big, dull-looking catalogue of refactorings sprawling across the middle half of the book, so I expected that one to be the harder going of the two.  I’ve been surprised to find that it ain’t so: while Refactoring turns out to be pleasantly light and digestible bedtime reading, the Rails book demands much more concentration.  (I finished Refactoring a few nights ago, and now feel better about the blogs that I wrote about it, as those last few chapters didn’t change my mind about any of the important issues.)

Writing about the Rails book so soon after The C Programming Language, I can hardly help but be very aware of the huge difference in tone.  Whereas Kernighan and Ritchie’s classic is gently austere, the Rails book is distinctly jocular: it’s a book that wants to be your friend, and sometimes comes across as a little too needy (as for example in the overuse of the phrase “it turns out that …”, which at times seems to prefix every other statement of fact).

Stylistic bouquets and brickbats out of the way, does the book get the job done?  I think so.  I’ve read the first fourteen chapters, which constitute the tutorial — the remainder of the book is reference material — and I have read them carefully and in detail.  Those chapters walk the reader step-by-step through the creation of a basic store site with a catalogue, AJAX-enabled shopping cart, administrative UI and RESTful access to items, plus internationalisation and testing.  That is an awful lot to get through, and it’s to the book’s credit that it does so in only 250 pages.  They are dense pages, though: despite the jocularity, there is a lot of code, and it’s in the nature of Ruby code that it packs in a lot of semantics per unit syntax, so you do need to pay attention.

The way I’ve been tackling the book is that I first read each chapter through carefully, away from the computer; then when I’ve finished, I’ll go through it again with the computer, typing in all the code, running it, checking that it works, and where necessary making the very small changes needed to run correctly on the current version of Rails.  Yes, it’s a slow process — that’s why it’s taken me more than a month — but hopefully the result is that I’ve come through it with a pretty solid understanding.

… but I don’t feel like I understand it solidly.  I don’t think that’s the book’s fault: there is just so darned much of Rails that if it had stopped to explain everything in detail as it’s encountered the first time, those 250 pages would have been 1000 or more.  To be fair, the book knows and acknowledges this: for example, on page 65 it says:

From your application’s top-level directory, type the following magic incantation at a command prompt.  (It’s magic, because you don’t really need to know what it’s doing quite yet.  You’ll find out later.)

depot> rake db:migrate

Nevertheless, the upshot is that I feel that I am surfing on the edge of comprehension: I do understand the code I’m looking at, but I have no confidence that I could change it much based on what I’ve learned so far.

What Rails itself is like

I think that my not feeling secure with Rails is not the fault of the book, but of Rails itself, or, more precisely, of frameworks.  More precisely still, I suppose I shouldn’t call it the fault of frameworks, but just their nature; just as I wouldn’t say that a lion is “at fault” for killing and eating a nice cuddly zebra.  (And not only because zebras are in fact psychotic.)  It’s a lion; it’s doing what lions do.  And Rails is a framework; it’s doing what frameworks do, which is to try to hide a lot of complexity.

Rails does a lot for you.  It manages (among many other things) your object-relational mapping, with conventions for naming, the mapping from URLs to actions on controllers, the mapping from there to views, unit testing and functional testing, provision of fixtures for the tests, generation of HTML and XML from templates, and so on.  The Rails mantra “convention over configuration” makes much of this painless, so that (for example) if you have a “checkout” action in your “store” controller, the HTML template views/store/checkout.html.erb is used without your needing to do anything to make it so.
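To make the checkout example concrete, here is a minimal sketch in plain Ruby of what “convention over configuration” amounts to for templates.  The helper name conventional_template_path is made up for illustration, and this is certainly not how Rails itself resolves templates; it only shows that the path can be derived from the controller and action names alone.

```ruby
# Hypothetical helper, NOT Rails' actual implementation: derive the default
# template path for an action purely from naming conventions.
def conventional_template_path(controller, action, format: "html", engine: "erb")
  File.join("views", controller.to_s, "#{action}.#{format}.#{engine}")
end

# The "checkout" action of the "store" controller needs no configuration:
puts conventional_template_path(:store, :checkout)
# prints views/store/checkout.html.erb
```

Nothing here had to be declared anywhere: the names are the configuration.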

One such implicit association is between test fixtures and database tables.  A fixture is a set of records that are added to the test database before running a test script, and like so much else in Rails, they Just Work.  So for example, if your test script says fixtures :users, then the fixture file test/fixtures/users.yml will be parsed as YAML and each record in that file will be added to the “users” table in the database.  There’s no need to say the kind of thing you’d say in other web-app frameworks:

fixture :users => "users.yml", "users"

Meaning that the fixture whose name within the test script is “users” should be loaded from the file called “users.yml” and its records added to the “users” table.
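As a rough illustration of the implicit version, here is a sketch in plain Ruby in which a single name determines both the YAML file and the target table.  The load_fixture helper and the sample records are my own inventions for this example, not Rails code, and the “table” is just a string rather than a real database table.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical stand-in for the convention, NOT Rails' own code:
# one fixture name decides both the file to read and the table to fill.
def load_fixture(name, fixture_dir)
  path = File.join(fixture_dir, "#{name}.yml")
  records = YAML.load_file(path) # maps each record label to its attributes
  { table: name.to_s, file: path, rows: records.values }
end

Dir.mktmpdir do |dir|
  # Made-up sample data standing in for test/fixtures/users.yml
  File.write(File.join(dir, "users.yml"), <<~FIXTURE)
    dave:
      name: Dave
    mike:
      name: Mike
  FIXTURE

  fixture = load_fixture(:users, dir)
  puts fixture[:table]       # prints users
  puts fixture[:rows].length # prints 2
end
```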

So this is good, right?

Oh, mama.

Those leaky abstractions I promised you

Yes!  It’s good!

Right up until it isn’t what you want, and then things get ugly.

There is a pleasingly simple example of this on pp. 245-246 of the Rails book, on fixture filenames.  (How very convenient!  The very thing we were just discussing!)  Here, we are building a performance test for the speed of adding products to a cart, and we need to make a fixture that contains a thousand dummy product records.  Because those product records are going to go into a “products” table, the fixture file is called “products.yml”, but — uh-oh! — we already have a “products.yml” fixture which we’re using for unit-testing the Product model class.  What to do?

The solution that the book teaches you, which I assume is the best one available, is not pretty.  There’s no way to change how Rails maps a fixture’s name to a table name, so your file has to be called products.yml; which means all you can do is put it in a different directory.  So the solution is to make a fixtures/performance subdirectory and put the new products.yml in there.  And then *cough* *cough* in your test-script, you say:

self.fixture_path = File.join(File.dirname(__FILE__), "../fixtures/performance")
fixtures :products

We’ve come a very long way from just saying fixtures :products, haven’t we?

Now it would not be hard for Rails to fix this particular case.  It might introduce a new fixture_path method, so that you could just say

fixture_path "performance"
fixtures :products

Or it could extend the existing fixtures method so that you could optionally specify the fixture filename, like this:

fixtures :products => "performance/products.yml"

Or indeed:

fixtures :products => "products2.yml"

Which dispenses with the subdirectory and lets you give the file a different name.
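For what it’s worth, here is a sketch in plain Ruby of how such an extended method might decide which file feeds which table.  To be clear, this is entirely hypothetical: resolve_fixtures and FIXTURE_DIR are names I made up, and Rails offers no such thing as far as this post is concerned.  A bare name falls back on the convention; a name => filename pair overrides it.

```ruby
# Hypothetical resolver for the extended syntax proposed above.
# Rails does not provide this; it only illustrates the idea.
FIXTURE_DIR = "test/fixtures"

def resolve_fixtures(*specs)
  specs.flat_map do |spec|
    # A bare name uses the convention; a hash gives explicit filenames.
    pairs = spec.is_a?(Hash) ? spec.to_a : [[spec, "#{spec}.yml"]]
    pairs.map do |name, file|
      { table: name.to_s, file: File.join(FIXTURE_DIR, file) }
    end
  end
end

resolve_fixtures(:users, :products => "performance/products.yml").each do |f|
  puts "#{f[:table]} <- #{f[:file]}"
end
# prints:
#   users <- test/fixtures/users.yml
#   products <- test/fixtures/performance/products.yml
```

Even in this toy version, the method now has two code paths where it used to have one, which is the cost discussed next.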

But of course this is only one of many, many such cases where the default assumptions that Rails makes for you might turn out to be not what you want.  And every time a new feature is added to deal with corner-cases like this one, the simplicity and DWIMminess of Rails, which is why we wanted to use it in the first place, is diminished.

The term “leaky abstraction” is, as far as I know, due to Joel Spolsky, who coined it in his classic blog post The Law of Leaky Abstractions.  That post contains (in bold font no less) the claim that “All non-trivial abstractions, to some degree, are leaky”.  That’s actually an overstatement: it’s not hard to think of counter-examples, such as Unix pipes, Ruby Bignums, and Strings in just about any language other than C++.  It would be fairer, then, to say something like this (which I am going to put in bold, too):

Good design in computer programming consists of inventing abstractions that don’t leak.  Good programming consists of implementing those abstractions in such a way that they don’t leak.

You can quote me on that.

By the way, I notice that this raises the question of whether it’s a faulty abstraction that leaks, a faulty implementation that makes it leak, or both.  I’ve got no clear conclusion on that yet, since I only just thought of the question.  Feel free to enlighten me.

So does this mean that Rails sucks?

No, that’s not what I meant to say at all!  Rails is famously learnable, and my limited experience so far confirms that it’s worthy of that good renown.  Still, it is a framework.  And what is a framework, if not a huge bundle of interacting abstractions?  It seems inevitable that some of them are going to leak.  The radius of comprehension is already high in any framework (you can’t write a Rails controller without also knowing about models and views and a bunch of auxiliary concepts), and every leak makes that radius greater.

So far, I am ambivalent about Rails.  It seems that it’s about as good as a web-app framework can be in terms of learnability and comprehensibility (and it’s surprisingly good on performance, too).  But I still feel too much as though I have to sacrifice my autonomy in order to play its game.  Even with Rails, I have to learn a whole bunch of concepts before I can get started, and then pray that the edges are going to remain as clearly defined as they now appear once I start doing Real Work.  I’ll get back to you in a couple of months and let you know how that’s going.

All I can tell you for now is that I am cautiously optimistic, but wary.  I’m not confident that, for example, when I start doing the bit of my application where I stitch together a tree of the objects in my database, and have to explain to my model that each object has a link to another object of the same type, that’s a concept that’s going to fit neatly into the clean set of abstractions that Rails has given me.

Fingers crossed that they don’t start leaking.

18 responses to “Frameworks and leaky abstractions”

  1. I totally agree. The c2.com crowd calls this the 80/20 rule of software (http://c2.com/cgi-bin/wiki?EightyTwentyRule), and that has always been my experience as well.

    I find it especially interesting to hear you say it in the rails context, because that has always been my assumption/suspicion, even though I for that very reason never actually tried using rails. That might be an arrogant mistake, but I just feel so annoyed whenever I have to spend half an hour digging through framework documentation trying to do something that I know how to do with ease when not working through the framework.

    Of course, there’s an element of laziness here as well. Learning a new framework is indeed hard. It might even be that it’s harder than learning the actual language. I find it’s hard for me to find the motivation to do that, since frameworks don’t really enable me to do anything new. They’re just productivity tools, and I don’t have any clue how much more productive I’ll be until I try it.

  2. I feel obligated, as a webdev Pythonista who’s been doing suspiciously too much Ruby lately, to tout my experience with Django, the Rails equivalent for Python.

    When I first started using MVC frameworks a few years ago, I went from my terrible hand-rolled PHP to Symfony, a nice PHP Rails clone. From there, my job required me to pick up Rails, and I found it wholly unpleasant. Admittedly, I was not in “the right place”, mentally, to accept the Rails Way(tm), and the new Rails 3 has Merb influence, so I am certainly going to give it another try soon.

    The point I almost forgot I was making, however, is that very soon after, I started doing Django work at my other job, and fell in love. Rather than Rails’ convention over configuration, Django explicitly states most things, allowing someone new to the framework (or the code base) to easily trace the program execution flow. This also removes a whole host of bugs that come from the magic frameworks have a tendency to include. (Django actually had a magic-removal branch prior to its 1.0 release.)

    Now, as a full-stack framework, Django is most certainly not perfect, and the magic it does include can be a real PITA on occasion. I feel that it’s a good example, however, of how frameworks don’t have to be as painful as many people find them.

    Oh, and it has an unusually low number of auto-generated files when you create a project, making it easy to implement sections as you go, rather than all upfront. :)

  3. Brian Morearty

    When you do start doing the bit of your application where you stitch together a tree of the objects in your database, take a look at awesome_nested_set. It’s great. http://github.com/collectiveidea/awesome_nested_set

  4. To be honest, I expected someone like you to sport Merb (no, it’s not dead, far from it), or Sinatra.

  5. Xiong, it’s interesting that you seem to be evaluating frameworks on the basis of how little magic they do, which more or less equates to how little functionality they give you (and how much you have to do for yourself). In other words, you prefer control over convenience. I can easily sympathise with that.

    Brian, thanks for the pointer to awesome_nested_set. I looked into it, but I don’t think it’s going to help me because its use of magically named lft and rgt fields indicates that it’s specifically for binary trees, whereas I need, in general, each of my nodes to have between zero and infinity children.

    Chris, I have never come across either Merb or Sinatra. What is it about them that you thought would appeal to me specifically?

  6. Hey Mike,

    Can you expand briefly on your side note about zebras being psychotic? I missed whatever it was that you meant here.

    (A brief Google session led me to a few interesting places, but none that left me thinking it was what you had in mind.)

  7. The observation on zebras is based on my experience feeding one from a car in a safari park in Texas several years ago. I looked into its eyes (no, through its eyes, into its soul) and saw that it was irremediably insane. Then it ripped the zebra-food bag out of my hand and ate the whole lot. As we drove on, I remember being grateful that it had eaten the food rather than the hand that held it.

    So, zebras: they may look cute at a distance, but don’t be fooled.

  8. Mike: The nested set model is not limited to binary trees. Take a look at http://dev.mysql.com/tech-resources/articles/hierarchical-data.html for an explanation of the model.

  9. I wouldn’t call that a leaky abstraction. There’s an easy way to configure the convention.

    #load the data from carts.yml and lotsa_products.yml
    fixtures :roles, :carts, :lotsa_products

    #set the class to use for lotsa_products.yml
    set_fixture_class :lotsa_products => "Products"

  10. Thanks, Emil, that’s a useful link. I see that I jumped to conclusions on what lft and rgt are for; perhaps awesome_nested_set will be useful to me after all.

    By the way, another component I’m going to need is user registration and validation before they’re allowed to maintain the data. I know how to build that myself in Rails, but no doubt it’s been done many times before. Does anyone know whether there’s a The Right Answer to how to do this, ideally embodied in an It Just Works plugin?

    And thanks, mattmc, for your solutions to the fixture-name problem — nicer than what the book recommends! Still, I hope you appreciate the broader point that this particular problem was illustrating.

  11. Huh…I just realized I misread your issue. I was thinking you were using a class name that didn’t match the table name.

    Just put the special products for the cart test in a cart subdirectory. I think what you are actually looking for is:

    Fixture.create_fixtures('test/fixtures/cart', 'products.yml')

    This also takes a class name hash…

  12. I have to disagree about Bignums and Strings not being leaky. (I haven’t thought about an argument for pipes yet)

    Bignums are leaky when you try to optimize them in a compiler. There are optimizations done based on type inference. Now, if you had only ints to deal with it would be easy. But of course, when the num would overflow into a larger representation it all breaks down.

    Strings are leaky in what Joel Spolsky describes as Shlemiel the Painter’s problem: http://www.joelonsoftware.com/articles/fog0000000319.html

    If Strings weren’t leaky, we wouldn’t need StringBuilders.

    There is a common pattern as you can see. Both leaks appear in optimizations. That is because when you optimize something you have to go one level down the abstraction level.

    Actually, it’s not just optimizations. It’s any case when you are not satisfied with the abstraction and you want to do something differently. Optimization is just a special case. But there is always a point when you want an abstraction to be different, thus I think it’s impossible to have abstractions that don’t leak, like you said, and Joel’s axiom stands:

    All abstractions are leaky.

    Perhaps the goal should be to have abstractions with as few leaks as possible. Or maybe even provide a way to work around the leaks and go down to the level you need. I don’t know if this is possible. If it’s not, then you need a trade-off. But that is another rule by which all engineers live, so I guess there’s nothing new.

  13. Merb was dubbed the “framework for hackers”, and was born as a way to handle concurrent file uploads in rails. It’s grown a lot, and I like to call it “better rails”. It’s more modular, lighter, simpler, faster… You get it. Merb is currently being merged into Rails for what is to be known as Rails 3.0. However, a lot of us Merbists won’t migrate to Rails when this is done, but will continue to use whatever is to be known as Merb 2.0.

    Sinatra is another step down from magic. It’s a simple DSL to write web applications in Ruby… Probably as low as you can go before sitting right on Rack. There is probably no magic to speak of, although you can make Sinatra apps fairly complex… but why would you do that?

  14. Pingback: Page not found « The Reinvigorated Programmer

  15. Steffen Beyer

    Regarding user registration and authentication — still learning Rails myself, but as there are no answers yet…

    The first plugin to do this Quite Right was Restful Authentication[1], which generates a lot of boilerplate code in your app. Easy to tweak, but expensive to maintain. (Or not?)

    Somewhat newer is Authlogic[2]. Takes a bit more time to get you started, but once you got it running, you already know the plugin a little bit — which can’t be too bad for handling user authentication.

    [1] http://github.com/technoweenie/restful-authentication
    [2] http://github.com/binarylogic/authlogic

    [Mike says: thanks, Steffen, I'll look into these.]

  16. Devdas Bhagat

    All abstractions leak when they fail.

    Pipes leak when you need to change things. They leak when you need to introduce error handling.

  17. Pingback: Git is a Harrier Jump Jet. And not in a good way « The Reinvigorated Programmer

  18. Pingback: Programming Books, part 3: Programming the Commodore 64 | The Reinvigorated Programmer
