Common bugs and why exercises matter (binary search part 2)

Many, many thanks to all of you who have contributed to the astonishing response to the previous article on the difficulty of getting binary search right.  I’ve never seen anything like this: in just over 24 hours since I posted the article, 541 comments have been posted, and they’re still rolling in.  (That number is a bit inflated by people re-posting the same code-samples multiple times after seeing WordPress mangle their source, but even taking that into account it still amazes me.)

You guys rule.  Seriously.

I have a lot to say in response to it all.  Read on …

The lameness of the WordPress commenting system

First up, an apology to all of you who posted code that got mangled.  As several commenters pointed out, WordPress comments are not really a great medium for this kind of challenge: their tendency to eat less-than characters, their lack of a Preview facility and the inability to edit a comment once it’s been posted combine to make the post-a-code-fragment process rather more of an adventure than it ought to be.  (That said, I remain astonished at how many people used curly brackets around their {source} tags even though my instructions clearly said in bold to use square brackets.  Come on, folks — read the article before submitting your code!)

Anyway, the next time I do something like this, I will be sure to include very, very explicit instructions right at the start, and to include links to various code-fragment hosting services such as, and

In the mean time, see the WordPress Support article on Posting Source Code.

Different kinds of comments

It’s been fascinating to see how different the comments have been in the three places that the binary search article was discussed.  Here on The Reinvigorated Programmer, the comment thread was massively dominated by attempts to meet the challenge.  The comments on Hacker News were primarily discussions of the problem, of how it applies in real development situations, of what the gotchas are, and of how best to guard against them.  The Reddit comments were rather more mystifying — many redditors seemed to feel that the idea of actually coding anything was rather beneath them, and were rather affronted that anyone might suggest such a thing; others found the idea of coding a binary search “elitist”.

Of course there was plenty of variation in all three venues (for example, this guy on Reddit made a serious attempt to solve the problem, then fessed up when he realised his routine was buggy).  But that’s the overall impression I got.  Sorry, redditors — I still love Reddit, honest!  By the way, Reddit had a very ambivalent response to this article, as it often seems to with what I post here: at the time of writing, Reddit scores the binary-search challenge with 44 points, but that is the result of 121 upvotes and 77 downvotes, so only 60% of the votes were positive.  At the risk of alienating redditors, I can’t help wondering whether some of the negativity comes from insecurity — programmers who know they can’t write low-level code and prefer not to be reminded of it.  I probably shouldn’t have said that.  Hmm.  Must remember to go back afterwards and delete it.

Two of the more interesting areas of commenting (both of which I disagree with) were that you don’t need to write this routine because it’s already in the libraries; and that coding without testing as you go is nonsensical.  I’ll address the first of those in more detail below; and I’ll talk about testing next time, to avoid making this article too long.


At this stage I’d hoped to unveil a simple analysis of the form “14 people attempted the challenge, of whom 6 reported bugs in their own code and 8 claimed success, but I found bugs in 5 of those for a total success rate of 3/14 = 21% which means Reinvigorated Programmer readers are 2.1 times as clever as IBM and Bell Labs professionals”.

That neat idea went out the window for three reasons:

  • The good: there were 557 submissions instead of 14, in maybe fifteen different languages, and there was no way I could test them all.
  • The bad: because I produced the Jon Bentley quote from Programming Pearls before stating the rules of the challenge, a lot of people eagerly ploughed straight in, and so inadvertently broke the rules.  My bad.
  • The ugly: WordPress mangled the source code of many of the submissions.

So I am not even going to attempt to come up with numbers based on the actual data.

That said, I have read all 562 comments, and gathered at least a general sense of the success rate, and I’d say that based on people’s self-reporting, about half of the binary searches were correct.  (Kudos, by the way, to all those of you who were honest enough to report failure; and, for that matter, to those of you who achieved success.)

If that 50% figure is anywhere near accurate, then this group of blog-readers are well ahead of the averages.  But bear in mind that probably some people failed but didn’t report that fact; and for sure plenty of people who reported success had simply not found bugs in their programs.  For example, Eric Burnett claimed in a comment that he’d found bugs in nine solutions reported as correct.  I’m sure that plenty more “correct” solutions still harbour bugs, and I encourage you all to go and hunt down each others’ mistakes!  In particular, finsprings, Aaron, Mike J, Langtree, Luke, Marcus, cycojesus, Erik Swanson and donaq (the nine people whose code Eric challenged) might want to see if they can find a mistake in Eric’s own solution!

Those of you who want to find bugs in other people’s solutions — or indeed assure yourselves that your own solutions are as good as you think they are — will find it useful to take advantage of Steve Witham’s set of 4096 test cases.  Many thanks to Steve to providing these.

Anyway, even allowing the various biasing factors, it seems possible — maybe even likely — that our hit rate is better than 10%.

Common bugs

It’s been interesting to see that many of the programs that had bugs were going wrong in the same ways.  Common bugs included:

  • Not dealing correctly with a zero-element array
  • Not dealing correctly with a one-element array
  • Not finding a value in the last element of the array
  • Not dealing with failed searches (the sought element is not there)
  • Not coping with repeated elements in the array
  • Not omitting the midpoint from the subranges when narrowing the range after a probe.  (This  results in an infinite loop for some inputs.)
  • Pointing to the first and last members of the range but coding as though the last-pointer variable pointed to the element after the range (or vice versa).

(And, yes, not dealing with a zero-element array is a bug, for those commenters who asked.  Zero is a perfectly good number of elements for an array to have, and an array search that fails for empty arrays is no more acceptable than an addition operator that fails if you add zero.  A few commenters also asked whether it’s OK to ignore the case when a null Array reference is passed in: yes, of course — that would violate the precondition that the search is in a pre-sorted array, so all bets are off.  No array at all is completely different from an empty array.)

The interesting thing about these common failure modes is that they are all so predictable.  If, before starting to code, we’d each taken two minutes to write down the ways in which we could imagine our programs failing, I bet we’d all have listed most if not all of these.  Yet even knowing that, we find them hard to avoid.

In light of the list of common bugs, we can easily see what kinds of test cases we need to include in the suite: empty arrays, single-element arrays, arrays where the sought element is absent, or appears only at the end, or only at the start, or multiple times; and of course a healthy does of randomly generated sorted arrays.

Looking at the list of bugs, I find myself thinking of the rules in Kernighan and Pike’s Elements of Programming Style.  Back when I reviewed that book, and listed the rules that are gathered at the end of the article, some commenters felt that they were so obvious, so trivial, as to be worthless or even insulting.  And yet most of the mistakes listed above are covered in K&P’s list.  A few of the relevant rules are:

  • “Write clearly — don’t be too clever.”
  • “Make sure your code “does nothing” gracefully.”
  • “Watch out for off-by-one errors.”
  • “Take care to branch the right way on equality.”
  • “Make sure special cases are truly special.”
  • “Choose a data representation that makes your program simple.”

Then there is the rule that the I didn’t let you follow literally, but the spirit of which should pervade our design process: “Test programs at their boundary values”.  And the very general but very relevant rule “Don’t stop with your first draft.”  And of course I must mention my favourite:

“Say what you mean, simply and directly.”

Why does this exercise matter?

A lot of the comments on Reddit (and some here) were along the lines that it’s dumb to write a binary search routine, and that we should just use one from a library.  For example, Michael Eriksson wrote that “Trying to write an own sorting/searching/whatnot algorithm is a beginner’s error (in and by it self). There are well-tested and highly performant libraries for such basic functions available in any established high-level language”.

That seems wrongheaded to me.  A boxer never has to skip rope in the ring, but it’s an important part of his training regime.  A concert pianist never has to play D-flat major, four octaves, hands together, ascending and descending, as part of a performance; but he doesn’t for that reason neglect the exercise.  An olympic sprinter never lifts weights during the 100 meters final, but he certainly does as part of his preparation.  Why would we think that in programming we don’t need to do exercises that are similarly related to our day-to-day work?

I have a hypothesis about that, but it’s not one that’s going to be popular.  The boxer, the concert pianist and the sprinter need to be at the absolute top of their game in order to succeed.  If the boxer’s not light on his feet, he’ll get beaten up; if the pianist lacks dexterity, he simply won’t get booked, in such a competitive career; the sprinter deals in margins of hundredths of a second.  They practice, exercise, do training drills because they must: if they fall to, say, 97% of their best performance, they lose.  Could it be that programming is a little too comfortable?  Do employers expect too little?  Are we content just to stay some way ahead of the pack rather than striving to excel?  That’ll work if you’re happy to write Enterprise Beans For The Enterprise for the rest of your career.  Not so much if you’re hoping to go and work for Google.

So the primary reason for writing a binary search routine is not because we need to write binary search routines.  It’s because we need the mental exercise, the discipline, the pleasingly firm, cool feel of solid code beneath our fingers.  The ability to do a binary search right first time probably correlates pretty well with the ability to write more complex code correctly.  It builds up the right brain-muscles.

That’s the primary reason.  The secondary reason is because actually sometimes you do need to write a binary search, and the library routines won’t get the job done.  Or if they will, they’re grotesquely inefficient.  For example, suppose you have a 64-bit integer, and you need to find out whether it’s among the nine billion 64-bit integers that are stored in ascending order in a 72 Gb file.  The naive solution is to read the file into memory, making an array (or, heaven help us, an Array) of nine billion elements, then invoke the library search function.  And of course that just plain won’t work — the array won’t fit in memory.

Or consider a less extreme, and therefore more subtle, case: the file to be searched has only half a billion numbers in it (so it’s a 4 Gb file).  That will fit in memory in some modern computers, depending on how space-inefficient the in-memory representation is.  But the result of using the naive approach is that you read 4 Gb from disk every time you need to do the search.  Whereas if you could implement binary search directly against the disk file you’d only need to do log(2) (half a billion) = 29 probes, of which the last 10 will all be in the same 8 Kb disk block.  So you’ll only do 20 seek-and-read pairs instead of a seek and half a million reads.

No, this stuff doesn’t happen every day.  But situations do crop up when someone needs to know some actual, you know, computer science.  And when that happens, don’t we want to be the ones who can come swooping in and save the day?

Next time, I’ll be addressing all the howls of outrage about Rule 6 (“NO TESTING until after you’ve decided your program is correct”), and trying to escape the pestilence of “test-driven development”.  Stay tuned!

Update (the next day)

If you usually just read the articles and skip the comments (and who could blame you in the case of the original binary-search challenge post!), then I urge you at least to read this comment by Antti — it’s solid gold (even if it does rather trespass on the stuff I was saving up for my own next post, bah!)

Update 2: links to this whole series


69 responses to “Common bugs and why exercises matter (binary search part 2)

  1. Interesting the common bugs you pointed out. Many of those were ones I specifically checked my code would handle for before I tried testing it. I know where my mistakes tend to happen. The overflow problem with the addition (a+b)/2 is not a problem in python, but I’ve been bitten too many times by it in C, C++ so that too I would have been avoided .

    Back in the old, old days this was how you developed code. You didn’t have access to abundant computer resources, and couldn’t do all those iterative test cycles, so you examined your code before each compile/ run attempt.

  2. bey0ndy0nder

    yeah. That’s all good points. I don’t know why I’m saying this, but if I were doing this for real I would have written tests cases that did boundary value analysis, and also I would have done some white box testing things. Instead I simply generated some random numbers (what does that really test?). I also did not think about overflow. This is something I usually think about.

  3. Cheers to you, Mike!

    I absolutely love the analogy to boxers, pianists and sprinters. Thanks for encouraging and emphasizing the importance of this sort of mental fitness–your quip about Enterprise Beans is BRILLIANT, and I hope your hypothesis turns out to be more popular than you think.

  4. I like the comment about “Enterprise Beans for the Enterprise”. I always said I didn’t want to work for a “We Synergize E-business Solutions” company. Slightly different (company vs position) but related.

    Right now I am working (not quite as bad as Enterprise Beans for the Enterprise, but not really CS either) while also doing CS grad school work. I wish someone would pay me to go to school, because that is way more interesting than work. I don’t know if this means I haven’t found the right job yet, or if it means that the right job is “CS professor”, or what.

  5. I checked Eric’s code even though I don’t know Python. The logic is sound. So if it compiles, it’s good. Only real worry was making sure that ‘mid’ never equals ‘end’ since ‘end’ can be outside the range of the array and Eric accesses ‘mid’. But ‘mid’ will never equal ‘end’ according to the equation he uses.

    I too find the list of common bugs interesting. Does this mean we find most bugs at runtime and NOT when we actually code something up? More specifically, that everything we write has a 90% chance of having bugs in it?

  6. I completely agree. Standard libraries are a bad excuse not to learn these basic techniques. I am studying for a bachelor’s degreen in software engineering in Finland and we are taught the theory of various search algorithms and encouraged to try and code our own implementations. Actually we have an exercise in which you are expected to make your own implementation of a hash table and an efficient string search, which I should be right now (due to next friday).
    Regarding also your earlier posts, some of my class really enjoyed a course in embedded system we had last year, because with them you can and need to know the hardware thoroughly and (at least with the development environment we used) there isn’t an abundance of libraries, so everything is up to you.

  7. To give a somewhat more practical example, we have our own implementation of Quicksort in our code. Why? Because a common situation is that we only need the first x elements of the list that is being sorted, and you can tweak Quicksort such that it stops sorting after the first x elements are in order (and then maybe resume sorting later on). So we save on sorting time.

    Another reason to really know algorithms by heart is that is some cases some algorithms that are equivalent in their worst case time complexity perform hugely different for certain working sets. If you are neither aware of the different algorithms, nor of their tradeoffs, you won’t even recognise the situation.

  8. I just tried the test cases you linked and they do all pass for my code :-)

  9. Depending on what you’re working on, you might run into situations where you need an algorithm with a fairly uncommon requirement (say, a sort that is linear-time on nearly sorted sets) that can’t be found in a library for your language and platform of choice. You might need to implement an algorithm described in a research paper that hasn’t got a good open source implementation yet. You might even need to design a special-purpose algorithm. There are many reasons why you would have to write your own algorithms.

    About the reasons for failure in writing such an algorithm, I had an interesting thought while making my entry earlier, regarding the craft of programming. Here’s a half-assed analogy:

    If you asked an artisan woodworker to quickly whip up a wooden box out of some materials, what would the box be like?

    Would it have haphazardly placed nails all over the place, be structurally unsound, and have unused drill holes here and there because he just drilled them absentmindedly while “sketching” the box?

    Would he have to make ten iterations before getting a simple box right?

    No, of course not. He is an artisan woodworker. He knows how to do things right. He has a methodology. He knows to measure twice before cutting. He knows the right way to attach two boards together so they don’t break easily. He knows where and in which angle to place nails so they give the best support. He knows the type of glue he might use, the characteristics of the material he’s working with, the tools he has. He knows the best way to draw up a design on paper, which lets him make clear and simple calculations for the sizes of the pieces he needs.

    He does not waste time or materials on failures. He gets it right the first time. He knows his work doesn’t have “bugs” because he’s followed his methodology and done things right.

    In software, you have automated testing. Automated testing is wonderful, and is essential for high-quality software, as software tends to have a much higher order of complexity. Yet testing is not perfect; it doesn’t reveal all bugs. Tests are written by people. Tests rarely cover every possibility, even with good code coverage.

    To make code as bug-free as possible, in addition to tests, you need a good methodology when writing code, to make the right choices so that problems remain as simple as possible down the line, so that edge cases are dealt with promptly and with clarity, so that code remains as easy to reason about as possible. No extra drill holes, no haphazard nails.

    I think the binary search problem provided an excellent opportunity to discuss the kind of methods you might use for this, and I’ll give one example here:

    When processing sequences (array, list, et cetera), always use half-open ranges, where the first index is the first element in the range, and the second index is one past the last element in the range. This has many advantages down the line and makes many operations very simple. Splitting one range in two means simply picking a single pivot and using that in both of the new ranges: [x, y) -> [x, pivot), [pivot, y). An empty range is one where both indices are the same: [x, x). The count of elements is a single subtraction: y – x.

    Methods like this help make software robust and I think they are an essential part of the craft of programming. Thoughts?

  10. In fairness, python’s library routines would work perfectly well for your extreme cases, as they aren’t tied to a particular list class. This may not be the case for all languages, but I doubt it’s the only one.

  11. Carey, that’s interesting. I guess I can imagine how the API might look for arbitrary calling back, but it would be great if you could sketch some sample code for the search-in-a-sorted-file problem, using Python’s library routine for the binary search. Don’t forgot to use SQUARE BRACKETS instead of curlies around your {source} and {/source}!

  12. I don’t disgaree with what you are saying, in my opinion, a _good_ programmer should be able to solve technical, algorithmic problems. If you can’t, then it doesn’t make you a bad programmer, but it probably makes you a lesser programmer when compared to the guy who can.

    Of course there is a parallel discussion to be had here about what constitutes a good developer, which is not necessarily the same thing as a good programmer and probably includes phrases such as code quality and business value.

    And yes, well done, bravo to you, clap clap – you will get more geek points if you conjur up an implementation of an algorithm without ever running it or testing it (TDD doesn’t even come into it IMO). But it is definatelty not a test of how much of a good computer _scientist_ you are, quite frankly, it’s a pretty crappy way to appraoch a solution to a problem.

  13. Something along the lines of this should work:

    class FileList(object):
      Untested dumb view of a record-structured file.  Don't use this in anger.
      def __init__(self, filename, record_size, constructor=int):
        self.filename = filename
        self.record_size = record_size
        self.constructor = constructor
        assert record_size > 0
      def __getitem__(self, index):
        file = open(self.filename) * self.record_size)
        record =
        if len(record) != self.record_size:
          raise IndexError
        return self.constructor(record)
      def __len__(self):
        file = open(self.filename), 2) # seek to end of file
        return file.tell() / self.record_size
    import struct, bisect
    huge_list = FileList('integers', 4, lambda record: struct.unpack('i', record))
    bisect.bisect(huge_list, 1234567890)
  14. “There is a parallel discussion to be had here about what constitutes a good developer, which is not necessarily the same thing as a good programmer.” Excellent point, Steve. That distinction was hovering around somewhere in the back of my mind, but I’d not seen it clearly until you stated it this way.

  15. So bisect.bisect() expects its first argument to be an object that knows how to do __getitem__() and __len__() and that’s all, right? Makes sense.

    I guess you’d want to move your file = open(self.filename) statement into the __init__() method and save the result as self.file, otherwise you’ll be opening and closing the file log(n)+1 times.

    (By the way, although I generally prefer dynamic over static typing, it continues to appal me that neither Ruby nor Perl, nor AFAIK Python, has any way at all to talk about types here — something like Java interfaces would be a huge boon.)

  16. @Steve
    “And yes, well done, bravo to you, clap clap – you will get more geek points if you conjur up an implementation of an algorithm without ever running it or testing it (TDD doesn’t even come into it IMO). But it is definatelty not a test of how much of a good computer _scientist_ you are, quite frankly, it’s a pretty crappy way to appraoch a solution to a problem.”

    What’s that sarcasm for? We already noticed, that [b]YOU[/b] can not or are not willing to see the point in this.

    No one said that it’s of any business value to *desperately* try to write bug-free code in the first shot. But even you, spilling buzz-words like “business value” and “code quality”, or maybe especially you (for spilling those buzz-words) should be able to appreciate the fact that a developer, who’s not able or willing to put some time and thought in the behavior of his code prior to writing or at least prior to compiling and testing it, can only be called a lousy excuse of a developer, for his skill or even his mindset (which is worse, because skills can be improved more easily) is limited at a level so high that it limits the complexity or variety of solutions he can come up with for a given problem or that it even prevents him to solve a more low-level problem at all… now where’s the “business value” in such a developer?

    The point of this exercise was it being an exercise and not a bash against the use of libraries in general… and as the author also pointed out: there aren’t libraries for *everything*

    no offense though, just my two cents…

  17. That’s mostly correct. I could actually drop the __len__() definition, but that would require me explicitly supply the end point to bisect().

    Regarding reopening the file, that’s the sysadmin in me talking. Holding an open handle to huge files isn’t a polite thing to do, and I generally assume that objects are going to be held on to longer than they should :p

    Regarding typing, there are libraries which allow functions to be annotated with type declarations. Though in my opinion, there’s very little benefit to be had in this particular case.

  18. You might consider changing your theme.

    Look at the “white as milk” theme, it should properly realign your stuff, and hopefully make your blog look much nicer :)

    Wonderful blog, please continue to post as regularly as you do so I can keep you on my rss feeds.

  19. Pingback: 2010-04-21 Information Flow « 二三街角

  20. @mike

    Check out this blog post, it gives a pretty interesting view on how we may categorise ourselves:


    I think you misinterpreted me a little bit, perhaps the sarcasm threw you off – it’s a natural reflex I can only apologise for!

    I didn’t say the task was a pointless exercise, indeed I did it myself and enjoyed it (failed though… had to test it in the end!) merely that it is not a scientific approach to solving the solution. But a fun one nonetheless.

    Of course you are correct – any person who writes code should think about it, how else can you actually write it, again, I never even hinted this wasn’t the case.

    My point about developers was meant to be taken as orthogonal to the discussion and is off topic for this post really.

  21. REPROG FAN, I checkout the White As Milk theme on your suggestion, but it doesn’t have a banner image, and I am not giving up my Buffy/Angel, VIC-20, sushi and Veronica Mars (though I am considering swapping Veronica out for The Doctor.) Sorry.

    (Great username you have there, though!)

  22. @Steve
    That seems more understandable to me though I personally don’t agree with the distinction made on that post you linked to. That’s because what’s described as “Developer” there to me sounds more like a description for “Software Architect” which is indeed somewhat different to “Programmer”. Therefore the terms “Programmer” and “Developer” are pretty much the same to me.

    But maybe that shows the exact problem of discussions of topics which tend to produce a great variety of opinions based on rather small differences in the personal understanding of common preliminary assumptions on that topic.

    But that is off-topic for being related to communication theory, or is it? After all, it lead to misunderstanding. Me misunderstanding your message/intention and maybe even you misunderstanding the message/intention of Mike, who knows?

  23. “If that 50% figure is anywhere near accurate, then this group of blog-readers are well ahead of the averages.”

    The first time I wrote a binary search, it was in a language with 1-based array indexing. And I struggled with off-by-one errors. Bsearch is one of a large class of problems where 0-based indexing makes for far simpler code, and much less error-prone code.

    It may be that a large proportion of your readers are brilliant, or that many of them have implemented binary search before, or it just may be that more folks are using languages that do indexing in a way that is a more natural fit to the problem, than in the languages that were in common use when Bentley wrote his column.

  24. “For example, suppose you have a 64-bit integer, and you need to find out whether it’s among the nine billion 64-bit integers that are stored in ascending order in a 72 Gb file.”

    mmap would be the right solution, but I agree completly with your point.

  25. Given a simple, small, well-understood problem, most devs still fail to get it exactly right. If we can’t get this right, then imagine how unlikely we are to correctly implement real world problems which are complex, large and poorly understood. The kids on Reddit can’t think abstractly. They are so focused on the particular problem, binary search, they can’t see the big picture: even seemingly simple problems are difficult to get right. This lesson should reinforce the rule to code defensively. Unfortunately, most devs code cavalierly and accept bugs as inevitable rather than as a personal failure.

    FYI: My implementation had bug #6 from your list. I went to bed without dinner.

  26. Have to admit that I didn’t check for the empty array so make that a 49.8% success rate :(

  27. @mike haha. Well, at least do me the one favour of continuing your posts regularly?!

  28. Phillip Howell


    Playing with your analogy: would you ask an artisan carpenter to do their work without drafting it first? Asking a developer to develop correct code without testing is similar.

    In my case: most of the time, when I’m solving a small problem (on the scale of the bsearch one), my drafting tool is the code, compiler, and test cases. Prototype, test, throw away, implement, test again.

    But it’s a bad analogy in any case, for a few reasons. The first is sort of banal: there’s no material waste involved in our process, only time waste; and if the best way to reduce the time waste is to write-test-write-test, that’s what we should do.

    The second is more interesting: carpenters do not build wood from the atom up. They do not forge their own nails. They do not make their own glue. We do the equivalent of all of these things on a daily basis.

    The third is philosophical: carpenters make things, but we make things that make things. That is to say: what we do is so different from carpentry (and bridge-building, which is the more common bad analogy) that it is a fundamental mistake to try to compare the two.

    The processes we use have to be suited to what we’re doing and the materials and tools we’re working with. We’re not making cabinets and we’re not using wood and hammers.

    We have to find our own way and make our own craft. Testing is an essential part of that craft.

    @Mike Taylor:

    We’re not boxers, pianists, or sprinters, either. I don’t believe I’ll gain anything by doing 200 reps of ‘implement binary search in C++’. For me it comes down to: why implement a toy when there are an infinite number of useful and interesting things to implement? (Note that I did the bsearch exercise anyway, ’cause it felt like cheating to comment on it without having done it.)

  29. Playing with your analogy: would you ask an artisan carpenter to do their work without drafting it first? Asking a developer to develop correct code without testing is similar.

    No; that would be like asking a developer to develop correct code without designing it.

  30. Phillip Howell

    No; that would be like asking a developer to develop correct code without designing it.

    You might want to read the next graf in my comment again in order to understand my extension to the analogy more completely.

    Essentially: the code is the design, and testing is part of designing. (Do you really think that on a large, complex project, the carpenter isn’t going to do some prototyping with cheap materials?)

    To even better understand what I am saying, you should read “What is Software Design?”:

  31. Jeshua Smith

    Or like telling a bridge building to build a bridge without simulating it first? :)

    I have to do coding to do my job, but I don’t code for a living. When I code I’ll think about corner cases and make sure my code handles them correctly, but often I do that by running the code to test each corner case rather than trying to do mental simulation to figure out if the code handles it correctly. My workstation runs the simulation faster and more accurately than my brain.

  32. To even better understand what I am saying, you should read “What is Software Design?”:

    I looked at it, saw that the final conclusion was “C++ is a step in the right direction”, and closed the page immediately :-)

  33. Phillip Howell


    What strong critical reading skills you have, and what an ability to avoid tool-bias.

    It’s worth a serious read with an open mind; it’s not an article about C++, nor about any particular tool.

  34. Here’s my Haskell code that I’d like to explain my methodology
    behind writing it, while we’re talking about methodologies for
    not writing buggy code first time. I wrote it in about 20 minutes
    admittedly, but I’d never seen the binary search before and it
    was on my dinner break.



    find :: (Ord a) => a -> [a] -> Maybe a
    find k [] = Nothing
    find k xs = go (length xs) xs where
        go len [x] | compare k x == EQ = Just x
                   | otherwise         = Nothing
        go len xs = go len' xs' where
            xs' = case last up `compare` k of
                LT -> down
                otherwise -> up
            len' | even len = len `div` 2
                 | otherwise = (len `div` 2) + 1
            (up,down) = (take len' xs,drop len' xs)

    Just to confirm it’s right, I can confirm with QuickCheck:

    Haskell> quickCheck $ \x xs -> find x xs == Data.List.find (==x) xs
    Loading package random- ... linking ... done.
    Loading package mtl- ... linking ... done.
    Loading package QuickCheck- ... linking ... done.
    +++ OK, passed 100 tests.
    it :: ()
    (0.07 secs, 11911296 bytes)

    Cool. Haskell code has this habbit of just working first time.

    First I start by writing the type out:

    find :: (Ord a) => a -> [a] -> Maybe a

    This forces me to think:

    * What does it take (An instance of Ord[erable]) and thus what
    * operations will I use (`compare`). What does it return (either
    * Just a value or Nothing)?

    It takes an orderable value and a list of them (I couldn’t
    remember off the top of my head how to use arrays in Haskell.).

    Next thing I do is start a case analysis, and handle the empty

    find k [] = Nothing

    This method of programming basically encourages me to handle all
    the cases from the get-go.

    find k xs = go (length xs) xs where

    In my algorithm I’m just sorting a list, the range of values
    where the thing I’m searching for might be is kept in the `xs`
    value, and the length of it in the `len` value.

    First I handle the single element case. I don’t need to handle
    the empty case because you can’t half anything to get to 0. If
    I’re at the single element case, I check for equality and
    return Just the value, otherwise nothing. Straight away I’re
    explicitly saying “this function can return nothing at all.”

        go len [x] | compare k x == EQ = Just x
                   | otherwise         = Nothing

    Then I handle the case where it needs splitting. I keep iterating
    the `go` function until I get to the case above. len’ is the
    halved list length, and xs’ is the halved list.

        go len xs = go len' xs' where

    I split the list by getting the “up” and “down” lists
    i.e. elements before and after the mid-point. If the last item of
    the “up” list is less than the value, we go the opposite
    way, “down”. Otherwise (equal to or greater) we go up. This
    basically means if they’re equal we go up anyway because
    the “upper” (positionally e.g. [1,2,3,4,5], [1,2] is “upper”)
    always contains the midpoint element. So we don’t get infinite
    loops and such on duplicates.

            xs' = case last up `compare` k of
                    LT -> down
                    otherwise -> up

    Splitting the length is just me rounding:

            len' | even len = len `div` 2
                 | otherwise = (len `div` 2) + 1
            (up,down) = (take len' xs,drop len' xs)

    if xs == [1,2,3] then len’ == 2, last up == 2, up == [1,2], down == [3]

    if xs == [1,2,3,4] then len’ == 2, last up == 2, up == [1,2], down == [3,4]

    So you can play this out by substitution:

    k = 1
    go 4 [1,2,3,4]
    => go 2 [1,2]
    => go 1 [1]
    => Just 1
    k = 5
    go 4 [1,2,3,4,5,6,7]
    => go 4 [5,6,7]
    => go 2 [5,6]
    => go 1 [5]
    => Just 5

    Please don’t mangle my comment, WordPress. Please don’t eat my comment.

  35. @Phillip Howell

    As I prefaced, the analogy was hardly perfect, however:

    The second is more interesting: carpenters do not build wood from the atom up. They do not forge their own nails. They do not make their own glue. We do the equivalent of all of these things on a daily basis.

    Um, no. We don’t.

    We don’t redesign and rebuild the ALU on a daily basis; we use the operations provided by the CPU. Have you ever implemented a circuit to do 32-bit signed multiplication? Floating point operations? On a daily basis?

    We use the stack management provided by the compiler. Have you ever written a procedure in assembly language that interacts with a C program, taking care to push and pop the correct registers? On a daily basis?

    We use the memory management provided by the operating system’s allocator or by the virtual machine. Have you ever written your own allocator or garbage collector? On a daily basis?

    We use the specification of the language we are writing code in. We most definitely do not create new languages on a daily basis (as individuals at least; perhaps a new language is born every day in the world.)

    The third is philosophical: carpenters make things, but we make things that make things. That is to say: what we do is so different from carpentry (and bridge-building, which is the more common bad analogy) that it is a fundamental mistake to try to compare the two.

    Can a carpenter not build a loom?

    We have to find our own way and make our own craft. Testing is an essential part of that craft.

    I never contested that. I implore you to read again what I wrote about the ways of making a program as bug-free as possible.

  36. Chris Done, congratulations on your correctly pasted and unmangled comment! I think there is a lot for me to learn from this, but the sheer alienness of Haskell means I can’t follow at all. Even the very first line — “find :: (Ord a) => a -> [a] -> Maybe a” — what do all the arrows mean? What’s the difference between -> and =>? What do all the instances of “a” mean?

    Of course I could just go and learn Haskell, but I have the feeling that that might not be totally straightforward, especially at this time of night. So as an intermediate step towards enlightenment, I wonder if it would be possible to translate your solution from Haskell into a more conventional language? Or do the concepts just not fit?

  37. @Chris Done

    Haskell is an fascinating language, and I found your comment interesting, but I feel I must point out that what you presented is not really a solution to the problem.

    The issue is that you are trying to search in a linked list, rather than an array. Not only does this not satisfy the problem as stated, but it is also totally unsuited for binary search. Having picked a linked list for your data, no algorithm can perform any better than simply proceeding though the list one element at a time until either the item is found, a larger item is encountered, or the list is exhausted.

  38. Phillip Howell

    we use the operations provided by the CPU


    … provided by the compiler …


    We use the specification of the language we are writing code in.

    Physical laws.

    Less flippantly:

    We’re looking at different levels of abstractions. Perhaps this is because much of the time I work in a world with extremely limited libraries, but if I need a hammer and haven’t ever used one before, I usually need to build that hammer. And just about everything I write is a tool I’ve never used before.

    So, I disagree with your “Um, no. We don’t.”

    Oh, and the answers to your “have you ever” and “on a daily basis” questions are no, no, yes, yes, yes, no.

    Can a carpenter not build a loom?

    Yes, but we cannot build a chair. The code that we produce is more similar to the draft that the carpenter produced, not the chair made from it. The code isn’t the actual product.

    I never contested that. I implore you to read again what I wrote about the ways of making a program as bug-free as possible.

    You make a valid point here; I got caught up in your analogy and did not give what you were actually saying a fair reading because of it.

    I would agree that testing is only a part of a development process toolset, and that there are heuristics we can use to make software more correct from the start.

    I think that correct absolute rules are exceedingly rare, though. I’m a big fan of having a bunch of different tools available to me and picking the right ones for whatever I’m doing, using simple guidelines (e.g., “Choose a data representation that makes your program simple).

    (In particular, I think that “always use half-open ranges” is incorrect. Sometimes it’s simpler to represent ranges as start position and length.)


  39. William Laffin

    Well my code passed all 4096 test cases. Thank you for this post, because after reading proggit comments I thought this was an elaborate nerd snipe.

  40. Honestly, I would expect a good artisan carpenter to be able to design a simple box in their head and implement it in wood. Similarly, I would expect a good programmer to be able to design a simple algorithm in their head, and implement it in code.

    I guess the way I see it is this. If you don’t have a good enough model of the computer in your head to be able to test your code without hitting run, why would you expect any of your code to work correctly? Sure, testing programs is important to catch mistakes and prevent regressions, but fundamentally, you should know what the code you write does.

  41. Off topic, but just wanted to point out the first line of the post with the same link repeated several times breaks the layout in my feed reader :-(

    [Mike says: thanks for spotting that, John, now fixed,]

  42. Dan Scott, I know it’s only a list and so it doesn’t satisfy O(log n), but I didn’t know how to use Haskell arrays off-by-heart and wanted to write the code in one go. It was still a good exercise. Seems a lot of people had more fundamental problems to worry about, haha.

    Someone else did a proper array solution here if you’re interested:

  43. Mike, I think my declaration in C# (if you know that) would look something like:

        public class Find<T> {
          Maybe<T> find(IComparer<T> k, List<T> xs) {

    But yeah, you’re right. You probably just need to know Haskell for it to make sense. I’d just be explaining fundamentals.

    As an aside, it would be fun to see someone working on an isolated problem like this with vnc2swf or something, from start to finish, to see how they work.

  44. @Mike Taylor: re. “it continues to appal me that neither Ruby nor Perl, nor AFAIK Python, has any way at all to talk about types here — something like Java interfaces would be a huge boon.”

    There’s no direct equivalent, but as of Python 2.6, we have Abstract Base Classes, which can be used as a way to talk about what functionality an object supports. More info at

    There’s also several pre-existing add-on packages that allow similar kinds of declarations and introspection; probably the most featureful and mature is zope.interface,

    For whatever reason, the language maintainers chose not to fold zope.interface or something like it into the standard library, and did ABCs instead. I didn’t really pay attention to the discussions.

  45. Pingback: Top Posts —

  46. @Antti: A better example is to ask what a draftsman will produce when asked to provide a drawing of a wooden box, or better, something a bit more complicated. Programming is like a number of arts in that the draft and the final item are made of the same stuff. Surely, one does not expect a novelist to produce a good novel in one draft, nor a musician an overture, nor a painter a landscape. Doing the actual carpentry is more like deploying the software. You know: test twice, install once.

    Software is something that learns once it is out in the field, so there are good arguments for getting it out there before it is perfect. Apple and Google are notorious for this. Look at all the whining about the first iPhone release and Google never letting any software get past beta. A good analogy is to architecture. I’ll recommend Stewart Brand’s book, How Buildings Learn, in which he deals with the ways buildings are adapted to their uses. A good test of software is not its initial perfection, but how it can be improved.

  47. Everyone seems to get caught up in dissecting my analogy and not addressing the actual point. Maybe I should have just left it out. I’ll try to present the idea in the rawest terms possible:

    I make the assertion that it is a virtue to have as few bugs as possible in released software, whether it’s a beta release or a final version. I do not make the assertion that having as few bugs as possible is the only priority.

    There’s a set Bpot of potential bugs in a software project being written. A subset of Bpot, Btest will be detected by testing. A subset of Bpot, Bmet will be prevented by having a sound methodology when writing code.

    I make the assertion that Btest and Bmet overlap, but are not the same set. In particular, Bmet is not a subset of Btest.

    Therefore, while testing is important, it does not replace craftsmanship, but complements it. This is why I also believe that an exercise such as writing a binary search algorithm without testing is useful, because it is an opportunity to discover in clear terms where one can improve one’s craft, and increase the size of Bmet.

  48. Not coping with repeated elements in the array

    How would this be a problem? The problem statement does not require finding the leftmost or rightmost match

  49. I haven’t had time to read through all these comments, or to really test the one I did on the train yesterday to find the (inevitable)bugs. I just felt like I needed to chime in quickly and say how I found the exercise actually kind of fun and thought provoking. I found that I found a number of big mistakes by just sitting and thinking it over in my head instead of just alt-tabbing over to the terminal from gvim and throw my code at gcc immediately.

    As a very green recent CS grad, maybe I am just much less sure of my abilities than some, but I think this kind of thing might be a really awesome regular exercise. Pick some algorithm and just see how well you do, then follow up with lots of other people to pick apart all the solutions.

    Anyway, to Mike: thanks for starting up a lot of interesting discussion!

  50. OK I failed by your definition of binary search. Although I would argue that a zero sized array is not sorted as to be sorted the elements would be in order. Yet with no elements there is no order.

    It is interesting that you bring up K&R’s guides then also note your rule 6 as I would have TDDed this.

  51. No, a zero-sized array absolutely is sorted: just like an array of size 1, it contains no pair of elements a[i] and a[j] for j>i where a[i]>a[j]. And, seriously, what would you think of a search routine in a library that barfed if you asked it to search an empty array? The answer to the question “Does this array contain the value 42?” is a simple no when the array is empty.

  52. Interesting Mike. If there are no keys, can the Law of Trichotomy be met?

    “And, seriously, what would you think of a search routine in a library that barfed if you asked it to search an empty array?”
    If it had a precondition that an array was sorted I would think it was my fault for not meeting the pre condition.

  53. Liam, I just don’t know what to say to you. The idea that a zero-length array isn’t sorted is as bizarre to me as the idea that zero isn’t a legitimate value to search for. It’s just … from a different planet.

  54. Mike maybe I am from another planet :)
    Please do correct me in my thinking.
    As I understand the situation, for any set to be considered sortable it must meet the law which states that any key in the set is greater, less than or equal to itself. A one element set meets the condition as it is equal to itself yet a empty set does not as it has no keys to compare.

  55. Mike, I think that strictly speaking, an empty set is neither (or both) ordered or disordered, though I’d have to slog through discussions of ordinality to be sure. The precondition “sorted set” might, by a theoretician, imply non-empty. And that is precisely why you shouldn’t let theoreticians make the practical code… :)

    Generally, I find the “why do this” attitude a little bizarre. Why should an expert in a language do crosswords? I mean dictionaries exist, right? I tend to practice multiplying largish numbers in my head… why not use the calculator? Mental exercises are worth keeping sharp for their own sake, and if they have a professional tie, so much the better.

  56. There is a good quote, but I can’t remember who it’s by. It makes a compelling argument for being good at the fundamentals of programming to be a good programmer. I’m afraid I can’t really repeat it because it relies on analogies I don’t remember.

    Put basically, you can’t learn a piano piece without knowing what notes are, or what chords are, thoroughly, otherwise you’re simultaneously trying to deal with notes and chords and putting all that together into a melody and physical skill. You can sort of get by but it takes you twice as long and you need trial and error and get some things wrong without knowing why.

    You might say fair enough, no one is disputing that. But if you can’t even write a simple algorithm like binary search without little bugs in it, and this is a really isolated and simple referentially transparent procedure, you must really struggle when you’re combining large complex systems together with a similar algorithmic complexity (which isn’t all that complex to begin with). I suspect the kind of thinking that goes into small problems is the same kind that is used at a higher level so it’s useful to practice it when it’s as simple as this.

  57. /* 
        C - MSVC 6 Implementation - Tested via the 4096 test cases
        (thank you Steve)
        I am a 90%er.
        1) incorrectly transcribed from paper "<=" became "<"
        2) The api used size_t for subscripts.  This fails, as the 
           algorithm runs while start <= end... when end changes from 
           0 to -1, it actually becomes a very large number instead.
        This was a humbling exercise indeed.   
        Thank you for the entertainment Mike.
    #include <stdio.h>
    #include <stdlib.h>
    #include <assert.h>
    int binary_search      /* Return 1 when found, */ 
    (                      /* and 0 when not found */ 
        int  *haystack,    /* Search this array    */ 
        int  start,        /* First subscript      */ 
        int  end,          /* Last subscript       */ 
        int  needle,       /* For this value       */ 
        int *subscript     /* Set this to where    */ 
                           /* the value was found  */ 
        if (haystack != NULL && subscript != NULL)
            while (start <= end)
                size_t mid = (start + end) / 2;
                if (haystack[mid] < needle)
                    start = mid + 1;
                    if (haystack[mid] > needle)
                        end = mid - 1;
                        *subscript = mid;
                        return 1;
        return 0;
    int main(void)
      FILE *fp = fopen("tests.txt", "rt");
      if (fp != NULL)
        char buffer[1000] = {0};
        int array[100000] = {0};
        int count = 0;
        int found;
        int result;
        int sub;
        while ((fgets(buffer, sizeof(buffer) - 1, fp)) != NULL)
          int lookFor;
          if (buffer[0] == 'P')
            printf("%s", buffer);
            fgets(buffer, sizeof(buffer) - 1, fp);
            lookFor = atoi(buffer);
            fgets(buffer, sizeof(buffer) - 1, fp);
            /* Now we load zero or more values */ 
            while ((fgets(buffer, sizeof(buffer) - 1, fp)) != NULL)
              if (buffer[0] == ']')
                found = buffer[3] == 'y' ? 1 : 0;
                result = binary_search(array, 0, count - 1, lookFor, &sub);
                assert (found == result);
                fgets(buffer, sizeof(buffer) - 1, fp);
                count = 0;
                /* Read past the line */ 
                if (fgets(buffer, sizeof(buffer) - 1, fp) != NULL)
                  goto AGAIN;
                array[count] = atoi(buffer);
                assert (count < 100000);
            printf("Out of sync, looking for Problem\n");
        printf("I can't seem to open tests.txt\n");
      return 0;
  58. Phillip Howell


    Therefore, while testing is important, it does not replace craftsmanship, but complements it.

    This is where we disagree: I believe that testing is a core part of craftsmanship and cannot be extracted into a separate thing.

    For baby-problems like binary search? Fine, it’s possible to craft without test. For more complex algorithms (think stuff on the order of stable marriage; or really most assignment problems), I don’t believe it is for most people — even for most people in the binary-search 10%.

    One of the best ways I’ve found to get the “opportunity to discover in clear terms where one can improve one’s craft” is sitting down and reading code with other good developers, looking for flaws.

    Where we do agree, I suspect, is that there’s no way to hone one’s craft without doing.

    @Chris Done

    I suspect the kind of thinking that goes into small problems is the same kind that is used at a higher level

    My experience is that this isn’t true. For me, systems-level thought is of a substantially different sort than algorithmic thought.

  59. Randy Hudson


    Think of it this way: in a sorted list L, there are no indices (m,n) such that mL[n]. (That is, there are no elements in the list out of order.) This is trivially true for an empty list.

    If that seems a strange (or strained) view to you, consider that most sorting algorithms, from the lowly Bubble to the mighty Quick, work by finding out-of-order pairs and switching them, until there are no more out-of-order pairs.

  60. Randy Hudson

    Uh, for “such that mL[n]” in my previous post, read “such that m less than n and L[m] greater than L[n]”.

  61. Pingback: From “The Reinvigorated Programmer”: Are you one of the 10% of programmers who can write a binary search? « Brendan Graetz

  62. Pingback: Another challenge: can you write a correct selection sort? « The Reinvigorated Programmer

  63. Pingback: Programming Books, part 5: Programming Pearls | The Reinvigorated Programmer

  64. Pingback: Are you one of the 10% of programmers who can write a binary search? | The Reinvigorated Programmer

  65. Pingback: Testing is not a substitute for thinking (binary search part 3) | The Reinvigorated Programmer

  66. Pingback: Writing correct code, part 1: invariants (binary search part 4a) | The Reinvigorated Programmer

  67. Pingback: Writing correct code, part 2: bound functions (binary search part 4b) | The Reinvigorated Programmer

  68. Pingback: Writing correct code, part 3: preconditions and postconditions (binary search part 4c) | The Reinvigorated Programmer

  69. Pingback: Binary Search Reconsidered (2018) - JellyEnt

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.