There are some programming books that I’ve read from cover to cover repeatedly; there are others that I have dipped into many times, reading a chapter or so at a time. Jon Bentley’s 1986 classic Programming Pearls is a rare case where both of these are true, as the scuffs at the bottom of my copy’s cover attest:
(I have the First Edition [amazon.com, amazon.co.uk], so that’s what I scanned for the cover image above, but it would probably make more sense to get the newer and cheaper Second Edition [amazon.com, amazon.co.uk] which apparently has three additional chapters.)
I’ll review this book properly in a forthcoming article (as I did for Coders at Work, The Elements of Programming Style, Programming the Commodore 64 and The C Programming Language), but for now I want to look at just one passage from the book, and consider what it means. One astounding passage.
Only 10% of programmers can write a binary search
Every single time I read Programming Pearls, this passage brings me up short:
Binary search solves the problem [of searching within a pre-sorted array] by keeping track of a range within the array in which T [i.e. the sought value] must be if it is anywhere in the array. Initially, the range is the entire array. The range is shrunk by comparing its middle element to T and discarding half the range. The process continues until T is discovered in the array, or until the range in which it must lie is known to be empty. In an N-element table, the search uses roughly log₂ N comparisons.
Most programmers think that with the above description in hand, writing the code is easy; they’re wrong. The only way you’ll believe this is by putting down this column right now and writing the code yourself. Try it.
I’ve assigned this problem in courses at Bell Labs and IBM. Professional programmers had a couple of hours to convert the above description into a program in the language of their choice; a high-level pseudocode was fine. At the end of the specified time, almost all the programmers reported that they had correct code for the task. We would then take thirty minutes to examine their code, which the programmers did with test cases. In several classes and with over a hundred programmers, the results varied little: ninety percent of the programmers found bugs in their programs (and I wasn’t always convinced of the correctness of the code in which no bugs were found).
I was amazed: given ample time, only about ten percent of professional programmers were able to get this small program right. But they aren’t the only ones to find this task difficult: in the history in Section 6.2.1 of his Sorting and Searching, Knuth points out that while the first binary search was published in 1946, the first published binary search without bugs did not appear until 1962.
— Jon Bentley, Programming Pearls (1st edition), pp. 35-36.
Several hours! Ninety percent! Dude, SRSLY! Isn’t that terrifying?
One of the reasons I’d like to see a copy of the Second Edition is to see whether this passage has changed — whether the numbers improved between 1986 and the Second-Edition date of 1999. My gut tells me that the numbers must have improved, that things can’t be that bad; yet logic tells me that in an age when programmers spend more time plugging libraries together than writing actual code, core algorithmic skills are likely if anything to have declined. And remember, these were not doofus programmers that Bentley was working with: they were professionals at Bell Labs and IBM. You’d expect them to be well ahead of the curve.
And so, the Great Binary Search Experiment
I would like you, if you would, to go away and do the exercise right now. (Well, not right now. Finish reading this article first!) I am confident that nearly everyone who reads this blog is already familiar with the binary search algorithm, but for those of you who are not, Bentley’s description above should suffice. Please fire up an editor buffer, and write a binary search routine. When you’ve decided it’s correct, commit to that version. Then test it, and tell me in the comments below whether you got it right first time. Surely — surely — we can beat Bentley’s 10% hit-rate?
Here are the rules:
- Use whatever programming language you like.
- No cutting, pasting or otherwise copying code. Don’t even look at other binary search code until you’re done.
- I need hardly say, no calling bsearch(), or otherwise cheating :-)
- Take as long as you like — you might finish, and feel confident in your code, after five minutes; or you’re welcome to take eight hours if you want (if you have the time to spare).
- You’re allowed to use your compiler to shake out mechanical bugs such as syntax errors or failure to initialise variables, but …
- NO TESTING until after you’ve decided your program is correct.
- Finally, the most important one: if you decide to begin this exercise, then you must report — either to say that you succeeded, failed or abandoned the attempt. Otherwise the figures will be skewed towards success.
(For the purposes of this exercise, the possibility of numeric overflow in index calculations can be ignored. That condition is described here but DO NOT FOLLOW THAT LINK until after writing your program, if you’re participating, because the article contains a correct binary search implementation that you don’t want to see before working on your clean-room implementation.)
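For the curious, here is a minimal sketch of that overflow condition only; it shows the midpoint arithmetic and no search code. Python integers never overflow, so the helper `to_int32` is a hypothetical stand-in (my assumption, not anything from Bentley) that mimics how a signed 32-bit sum would wrap in a language like C or Java.

```python
def to_int32(x):
    """Wrap a Python int to a signed 32-bit value (two's complement)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


# Two indices that each fit comfortably in a signed 32-bit integer.
low, high = 1_500_000_000, 2_000_000_000

# Adding them first exceeds 2**31 - 1, so a 32-bit sum wraps negative.
wrapped_sum = to_int32(low + high)

# Taking the difference first keeps every intermediate value <= high.
safe_mid = low + (high - low) // 2
```

With these values the wrapped sum is negative, while the difference-first form gives 1,750,000,000. That is why some of the submissions compute the midpoint as `low + (high - low) / 2`.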
If your code does turn out to be correct, and if you wish, you’re welcome to paste that code into your comment … But if you do, and if a subsequent commenter points out a bug in it, you need to be prepared to deal with the public shame :-)
For extra credit: those of you who are really confident in your programming chops may write the program, publish it in a comment here and then test it. If you do that, you’ll probably want to mention the fact in your comment, so we cut you extra slack when we find your bugs.
I will of course summarise the results of this exercise — let’s say, in one week’s time.
Let’s go!
Update (an hour and a half later)
Thanks for the many posted entries already! I should have warned you that the WordPress comment system interprets HTML, and so eats code fragments like
if a[mid] < value
The best way to avoid this is to wrap your source code in {source}…{/source} tags, but using square brackets rather than curly. (The first time I tried to tell you all this, I used literal square brackets, and my markup-circumvention instructions were themselves marked up. D’oh!) Do not manually escape < and > as &lt; and &gt;: the {source} wrapper deals with these. Doing it this way also has the benefit of preserving indentation, which no other method seems to do.
And an apology for WordPress: I really, really wish that this platform allowed commenters to preview their comments and/or edit them after posting, so that all the screwed-up source code could have been avoided. I’ve tried to go and fix some of them myself, but — arrgh! — it turns out that WordPress not only displays code with < symbols wrongly, it actually throws away what follows, so there’s nothing for me to restore.
Update 2 (four hours after the initial post)
Wow, you guys are amazing. Four hours, and this post already has more comments than the previous record holder (Whatever Happened to Programming, 206 comments at the time of writing.)
For anyone who’d like to see more discussion, there are some good comments at Hacker News and perhaps some slightly less insightful comments at Reddit, where actually writing code is seen as “elitism”.
Update 3: links to this whole series
- Are you one of the 10% of programmers who can write a binary search?
- Common bugs and why exercises matter (binary search part 2)
- Testing is not a substitute for thinking (binary search part 3)
- Writing correct code, part 1: invariants (binary search part 4a)
- Writing correct code, part 2: bound functions (binary search part 4b)
- Writing correct code, part 3: preconditions and postconditions (binary search part 4c)
I have the second edition and can tell you that the section you quoted above is essentially unchanged. The reference to IBM and Bell Labs was replaced by the more general “I’ve assigned this problem in courses for professional programmers” but the numbers are still there.
int binarySearch(int[] a, int value) {
    int low = 0;
    int high = a.length - 1;
    while (low <= high) {
        int mid = low + (high - low)/2;
        int midValue = a[mid];
        if (value < midValue) {
            high = mid - 1;
        } else if (value > midValue) {
            low = mid + 1;
        } else {
            return mid;
        }
    }
    return -1;
}
(not tested, typed in comment box)
Gah, once I hit “put down this column and write the code yourself”, I did. Failed to read the rules that said don’t test it. So essentially, I failed by not reading specifications, which is probably just as bad.
Implementation:
<?php
$a = array();
$k = 10;
for ($i = 0; $i < 500; $i++) {
$k += rand(1,20);
$a[] = $k;
}
foreach ($a as $v) {
echo $v.' ';
}
echo "\n";
for ($i = 0; $i $right) {
return false;
}
$k = floor(($left+$right)/2.0);
if ($array[$k] == $lookfor) {
return $k;
}
if ($array[$k] > $lookfor) {
return search($array, $left, $k-1, $lookfor);
}
return search($array, $k+1, $right, $lookfor);
}
?>
Correct on first run, according to the included test cases. Bug reports welcome.
Seemed to work for a couple quick tests, but blew up when searching for something that wasn’t in the list.
def bsearch_helper(list, target, low, hi):
    if low > hi:
        return None
    mid = (low + hi) / 2
    m = list[mid]
    c = cmp(m, target)
    if c == 0:
        return mid
    elif c < 0:
        return bsearch_helper(list, target, mid + 1, hi)
    else:
        return bsearch_helper(list, target, low, mid - 1)
def bsearch(list, target):
    return bsearch_helper(list, target, 0, len(list))
Hmmm, HTML fail. Let’s try again.
(not tested, typed in comment box)
Code below for any more crowdsourced debugging:
int search(int term, int * array, int size) {
    int mid = size / 2;
    if (array[mid] > term) return search(term, array, mid);
    if (array[mid] < term) return mid + search(term, array + mid, size - mid);
    return mid;
}
Success. Altho I did it recursively.
Has been tested:
static int binarySearch(int[] values, int val) throws Exception
{
    return binarySearchHelp(values, val, 0, values.length - 1);
}
static int binarySearchHelp(int[] values, int val, int start, int end)
throws Exception
{
    if (start > end)
        throw new Exception("Somehow indexes have gotten reversed");
    if (start == end)
        return values[start]==val ? start : -1;
    int mid = (start + end) / 2;
    if (values[mid] > val)
        return binarySearchHelp(values, val, start, mid);
    else if (values[mid] < val)
        return binarySearchHelp(values, val, mid + 1, end);
    else
        return values[mid]==val ? mid : -1;
}
@Josh: You sometimes return a boolean (False) and sometimes an integer (mid). Assuming you meant to return True instead of mid, you risk an infinite loop because you don’t guarantee that your interval gets smaller each step.
Don’t forget to account for numbers outside the range of your sorted array. I forgot it in my first attempt, so searching for something less than the first element or greater than the last would result in an infinite loop!
def find(list, n)
  mid = (list.size / 2).ceil
  target = list[mid]
  # Arg, I failed!
  return false if n > list.last || n < list.first
  if n > target
    return find(list[mid,list.size],n)
  end
  if n < target
    return find(list[0,mid],n)
  end
  true
end
list = (0..1001).to_a
puts find(list,500)
puts find(list,list.first)
puts find(list,list.last)
puts find(list,1000)
puts find(list,33)
puts find(list,-1)
puts find(list,1002)
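Those out-of-range probes generalise nicely. Below is a minimal sketch, in Python, of a harness that tries every small array length and probes each one with values below, between, at and above the stored elements. The names `check` and `reference_search` are hypothetical, and `reference_search` simply wraps the standard library's `bisect_left` so that the harness has something to run against; swap in your own function to test it.

```python
from bisect import bisect_left


def reference_search(a, x):
    """Stand-in function under test: index of x in sorted list a, or -1."""
    i = bisect_left(a, x)
    return i if i < len(a) and a[i] == x else -1


def check(search, max_len=8):
    """Return the first failing (array, value) pair, or None if all pass."""
    for n in range(max_len + 1):               # includes the empty array
        a = [2 * i for i in range(n)]          # 0, 2, 4, ...: odd values are absent
        for x in range(-1, 2 * n + 1):         # below, between, at and above
            present = x % 2 == 0 and 0 <= x < 2 * n
            expected = x // 2 if present else -1
            if search(a, x) != expected:
                return (a, x)
    return None
```

Calling `check(reference_search)` returns `None`. A function that loops forever on a missing value will of course hang the harness rather than fail it, so run it somewhere you can interrupt.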
Code here:
http://pastebin.com/ms6BYwyy
I was not confident enough to post it without testing, and rightly so because I had a bug because I wrote the code in a way that seemed very elegant to me but that turned out to loop forever.
The original comparison was:
if (array[h] = x) hi = h + 1;
Which would terminate the loop immediately if it so happened that array[h] == x, but which would lead to an infinite loop if the interval was 2 elements large with the first element smaller than the search value and the second element larger (as with my first test case).
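That two-element case is worth seeing in isolation. The sketch below is a hypothetical Python reconstruction, not the commenter's code: `search_steps` counts loop iterations and gives up after `max_steps`, and the `advance` parameter chooses between moving the lower bound to `mid` (no progress guaranteed) and to `mid + 1`.

```python
def search_steps(a, x, advance, max_steps=50):
    """Inclusive-bounds search that reports how many iterations it used.

    `advance` is added to mid when the lower bound moves up.
    Returns None if the loop is still running after max_steps iterations.
    """
    lo, hi = 0, len(a) - 1
    for step in range(1, max_steps + 1):
        if lo > hi:
            return step                 # range is empty: value is absent
        mid = (lo + hi) // 2            # rounds down, so mid == lo when hi == lo + 1
        if a[mid] == x:
            return step
        if a[mid] < x:
            lo = mid + advance          # advance == 0 can leave lo unchanged
        else:
            hi = mid - 1
    return None


stuck = search_steps([1, 3], 2, advance=0)    # lo stays at 0 forever
done = search_steps([1, 3], 2, advance=1)     # range shrinks every iteration
```

Here `stuck` is `None` and `done` is 3. Because the midpoint rounds down, a two-element range has `mid == lo`, so any update that sets `lo = mid` leaves the range exactly as it was.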
Time taken: 10 minutes.
I wrote it, and when I was sure it worked, I tested it. Not a single correction had to be made. Recursive algorithms are easy to think. I would have probably failed writing an iterative version.
bool binSearch(std::vector<int> const &v, int key, int first, int last)
{
    if (last - first == 1)
        return v[first] == key;
    int mid = (first+last)/2;
    if (key < v[mid])
    {
        return binSearch(v, key, first, mid);
    }
    else
    {
        return binSearch(v, key, mid, last);
    }
}
Argh, it seems smaller-than and greater-than signs are not escaped.
That was supposed to read
if (array[h] x) hi = h;
I fail. Buggy as crap. I bring shame to professional programmers everywhere.
int bs(int len, int array[len], int t)
{
int start = 0, end = len;
while (start < end) {
int m = (start + end) / 2;
if (array[m] t) end = m;
else return m;
}
return -1;
}
int binarySearch(int array[], int value, int low, int high)
{
    if(low > high)
        return -1;
    int midPoint = low + (high-low)/2;
    int midValue = array[midPoint];
    if(value == midValue) {
        return midPoint;
    }else if(value > midValue) {
        return binarySearch(array, value, midPoint+1, high);
    }else {
        return binarySearch(array, value, low, midPoint-1);
    }
}
// Didn’t test it. Just used pen and paper.
// This could stackoverflow if the compiler
// doesn’t support tail recursion.
Wrote it in Emacs (with SLIME), committed (^X ^E), fixed one syntax error (LET -> LET*), then hand-tested with a few corner cases in the REPL. I think I got it right, but I feel strangely unconfident…
VB.NET – I did get it wrong the first time, I had the upper and lower reversed when I was checking the startpoint.
Dim numarray(19) As Integer
Dim t As Integer = 7
numarray(0) = 1
numarray(1) = 2
numarray(2) = 3
numarray(3) = 4
numarray(4) = 5
numarray(5) = 6
numarray(6) = 7
numarray(7) = 8
numarray(8) = 9
numarray(9) = 10
numarray(10) = 11
numarray(11) = 12
numarray(12) = 13
numarray(13) = 14
numarray(14) = 15
numarray(15) = 16
numarray(16) = 17
numarray(17) = 18
numarray(18) = 19
numarray(19) = 20
Dim startpoint As Integer
Dim lower = 0
Dim upper = 19
Do
startpoint = (lower + upper) \ 2
If numarray(startpoint) = t Then
Debug.Print("found it")
Exit Do
End If
If upper = lower Then
Debug.Print("not found")
Exit Do
End If
If numarray(startpoint) < t Then
lower = startpoint
Else
upper = startpoint
End If
Loop
@Lawrence Kesteloot I mean to return False if it’s not found, otherwise return the index where it is found, similar to how you return a -1. I’m still prepared to don my ribbons of shame.
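One caveat with the "index or False" convention, assuming the code is Python: `False == 0` is true there, so a hit at index 0 and a miss are easy to confuse. The function `find` below is a hypothetical illustration of the convention, not Josh's actual code.

```python
def find(a, x):
    """Return the index of x in sorted list a, or False when absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == x:
            return mid
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return False


hit_at_zero = find([10, 20, 30], 10)    # found at index 0
miss = find([10, 20, 30], 15)           # not found

looks_equal = (hit_at_zero == miss)               # 0 == False compares equal
told_apart = (hit_at_zero is not False) and (miss is False)
```

Callers have to test `result is not False` rather than `if result:`, which is one reason returning -1 or `None` for a miss tends to be less error-prone.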
NO TESTING until after you’ve decided your “program is correct.”
Who cares what the results are with this rule in place? You’re not measuring anything that applies to real world development.
Next, you’ll tell me only 10% of programmers produce x lines of code per year.
first posting messed up by html, trying again
This was my first attempt. I haven’t discovered any bugs yet.
Thought about this a little bit more, since you said I was likely to screw it up. I’ll test it after I submit, since that seems to be the spirit of the exercise.
Hope I don’t blow it :)
Wrote a recursive solution in python with more or less no error checking. Worked until I changed from printing results to returning them and forgot to make the recursive calls return statements. Otherwise working fairly well. What am I missing?
This probably won’t look pretty:
def bsearch(listy, val, index):
    if len(listy) == 1:
        if listy[0] != val:
            print "ERRRRRRRRRRRRROOOOOOOOOOOOOOORRRRRRRRRRRRRRRRRRRRRR"
            return -1
        else:
            return index
    else:
        new_ind = len(listy)/2
        if listy[new_ind] == val:
            return index+new_ind
        elif listy[new_ind] < val:
            return(bsearch(listy[new_ind+1:], val, index+new_ind+1))
        else:
            return(bsearch(listy[:new_ind], val, index))
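The "forgot to make the recursive calls return statements" slip mentioned a few comments up is easy to reproduce. This is a hypothetical sketch with made-up function names: the two versions differ only in whether the recursive call's result is passed back to the caller.

```python
def search_no_return(a, x, lo, hi):
    """Recursive search where the recursive results are silently dropped."""
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if a[mid] == x:
        return mid
    if a[mid] < x:
        search_no_return(a, x, mid + 1, hi)    # computed, then thrown away
    else:
        search_no_return(a, x, lo, mid - 1)    # same problem here


def search_with_return(a, x, lo, hi):
    """Same logic, with the recursive results returned to the caller."""
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if a[mid] == x:
        return mid
    if a[mid] < x:
        return search_with_return(a, x, mid + 1, hi)
    return search_with_return(a, x, lo, mid - 1)


data = [1, 3, 5, 7, 9]
lucky = search_no_return(data, 5, 0, 4)      # found on the very first probe
dropped = search_no_return(data, 9, 0, 4)    # found deeper down, result lost
fixed = search_with_return(data, 9, 0, 4)
```

Here `lucky` is 2, `dropped` is `None` and `fixed` is 4. The broken version only looks right when the target happens to sit at the first midpoint, which is exactly the sort of bug a single quick test can miss.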
@Josh: If you want to return False, then the case of “low == high” is wrong because it returns “array[low] == x”. I think you can just remove that case altogether as long as you fix the recursion parameters.
@Daniel: Looks good.
##Woohoo!! Got right the first time:)
def bsearch(arr,key,start=0,end=None):
    if end == None: end = len(arr) - 1
    if start > end: return None
    if start == end and arr[start] != key: return None
    mid = (start+end)/2
    if arr[mid] == key:
        return mid
    if arr[mid] > key:
        return bsearch(arr,key,start,mid-1)
    if arr[mid] < key:
        return bsearch(arr,key,mid+1,end)
Don’t forget to account for numbers outside the range of your sorted array. I forgot it in my first attempt, so searching for something less than the first element or greater than the last would result in an infinite loop!
Note: Duh, fixed for html
Time taken: 8 minutes
Result: failed. My original attempt had “end = mid – 1” instead of “end = mid”. That’s the only change I made from the original version.
Total testing/fixing time: 5 minutes
So 13 minutes total. I wouldn’t be horribly surprised if there are more bugs in there…
int bsearch_int( const int * values, const size_t n, const int key )
{
    int begin = 0;
    int end = n;
    while( begin < end )
    {
        const int mid = begin + (end-begin)/2;
        const int midval = values[mid];
        if( key < midval )
            end = mid;
        else if( key > midval )
            begin = mid + 1;
        else
            return mid;
    }
    return -1;
}
a and b were initialised in the body but then I saw Lisp version :)
Ruby and other high-level languages are like high level pseudocode so that’s why it’s easy (but you can trip yourself if you don’t test carefully). Hope I got it right :)
I wasn’t going to bother posting, but since the only Python versions so far are recursive, here’s a really awful “whipped up in five minutes” iterative version. Hopefully, WP won’t mangle this too badly.
Well. First go, I wrote this:
long *binsearch( long *ary, long sz, long item)
{
if( 0 == sz ) return NULL; /* 0-length array */
if( ary[ sz>>1 ] == item ) return &ary[ sz>>1 ];
if( ary[ sz>>1 ] < item ) return binsearch( &ary[ 1 + sz>>1 ], sz>>1 + sz&1 - 1, item );
return binsearch( ary, sz>>1, item );
}
I was convinced that the logic was sound. C disagreed. Not being a daily user of C, it took me a while to figure it out… for reference, here’s the corrected version:
long *binsearch( long *ary, long sz, long item)
{
if( 0 == sz ) return NULL; /* 0-length array */
if( ary[ sz>>1 ] == item ) return &ary[ sz>>1 ];
if( ary[ sz>>1 ] < item ) return binsearch( &ary[ 1 + sz>>1 ], sz>>1 + (sz&1) - 1, item );
return binsearch( ary, sz>>1, item );
}
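The disagreement with C here looks like operator precedence: `+` and `-` bind tighter than `>>`, which in turn binds tighter than `&`. Python groups these particular operators the same way, so the effect can be sketched without a C compiler. The three function names below are hypothetical labels for the groupings, not anything from the comment.

```python
def as_written(sz):
    # Groups as (sz >> (1 + sz)) & (1 - 1), which is always 0.
    return sz >> 1 + sz & 1 - 1


def partly_parenthesised(sz):
    # Groups as sz >> (1 + (sz & 1) - 1), i.e. sz >> (sz & 1).
    return sz >> 1 + (sz & 1) - 1


def fully_parenthesised(sz):
    # Half the size, plus one if the size is odd, minus the middle element.
    return (sz >> 1) + (sz & 1) - 1
```

For `sz = 7` these give 0, 3 and 3; for `sz = 8` they give 0, 8 and 3. If the intent was the number of elements after the midpoint, only the fully parenthesised form matches it for both odd and even sizes, so the version with a single pair of parentheses may still deserve a second look.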
Python indentation is wrong in the previous submission!!!
I have put up my code at
bsearch.py
Erm, nevermind, looks like someone else already posted one. :)
Tested with first/middle/last element in even-/odd-size arrays. Works as far as I can tell.
python example, only cursory testing so far so be gentle:
def bsearch(lst, item):
bottom, top = 0, len(lst)
while top - bottom >= 3:
mid = (top + bottom) // 2
c = cmp(item, lst[mid])
if c 0:
bottom = mid + 1
else:
return True
if item == lst[bottom]:
return True
return top - bottom == 2 and item == lst[bottom + 1]
I’ll throw in my extremely unoptimized php version, just to represent the web dudes… (And yes, I tested it)
2)
{
$iMiddle = round($iArraySize/2);
}
elseif($iArraySize == 2)
{
$iMiddle = 0;
}
else
{
$bQuit = true;
print “\nRan out of values, is the answer: $aItems[0]?”;
exit();
}
$iTestValue = $aItems[$iMiddle];
print “Testing round $i”;
print “\nArray size: ” . $iArraySize;
print “\nMiddle of Array: ” . $iMiddle;
print “\nLooking for Value ” . $iVal;
print “\nFound Value: ” . $iTestValue;
if($iTestValue == $iVal)
{
//yay we found it!
print “\nYay they match!”;
$bQuit = true;
}
else
{
if($iArraySize > 2)
{
if($iTestValue > $iVal)
{
$aItems = array_slice($aItems, 0, $iMiddle);
}
else
{
$aItems = array_slice($aItems, $iMiddle );
}
}
else
{
$aItems = array($aItems[1]);
}
}
}
?>
Damn. HTML munched my code. :(
Untested (but compiled). Scrolling down to the comment box, I couldn’t help but glance at the other submissions. Fortunately, due to their similarity, that only increased my faith in my own attempt. Perhaps it would be best if they were somehow hidden for the next week, though…
I wish there were a comment preview – no idea whether this will be formatted correctly.
Let’s try that again…
Okay, it doesn’t. Screws up for values less than the array minimum. Ah, well.
That was fun – I took about 10 minutes and couldn’t wait to try it so I failed the test. I’m in the 90%. My submission has two problems, it compares the value to the index instead of the value and it doesn’t terminate when the value isn’t there. Both problems were easily fixed once I identified them.
#include <stdio.h>
#include <stdlib.h>
static int data[10000];
static int load(char *fname)
{
    FILE *fp = fopen(fname, "r");
    int dc = 0;
    int *dp = data;
    if (fp)
    {
        while (fscanf(fp, "%d", dp) > 0)
            dp++;
        dc = dp-data;
        printf("read %d values\n", dc);
    }
    return dc;
}
int main(int argc, char **argv)
{
int target = atoi(argv[2]);
int count = load(argv[1]);
int bot = 0;
int top = count-1;
int pivot;
int found = 0;
if (count == 0)
return -1;
while (!found && bot != top)
{
pivot = (bot+top)/2;
printf("[%d] %d [%d] %d [%d] %d\n", bot, data[bot], pivot, data[pivot], top, data[top]);
if (target > data[pivot])
bot = pivot;
else if (target < data[pivot])
top = pivot;
else
{
printf("found %d at index %d\n", target, pivot);
found = 1;
}
}
return found;
}
I attempted and failed. I instinctively hit the “Run” button before I was actually done, because it’s so ingrained to test the code at each step.
How do we know if we pass?
[ int BinarySearch(int value,int low, int hi, int[] list) {
var middle=((hi-low) / 2)+low;
if (hi<=low||list[middle]==value) {
return (list[middle]==value ? middle : -1);
}
if(list[middle]value) {
return BinarySearch(value, low, middle-1, list);
}
return -1;
}]
Most of my previous comment got eaten. This is what worked for me.
Bugfixing
Sorry, don’t count mine, I did test it. I recognise I am not able to “code on paper”, that’s no news for me :)
indentation. bah. http://gist.github.com/371492
Well, mine turned out a bit wordier than a lot of other people’s… bad programmer! (and to think I taught myself coding in the age of 16K machines with cassette drives… I’ve gotten fat and lazy.)
Like many others, I return the index location if it’s found and -1 if it’s not.
Thought of doing it recursively, which would be more elegant, but just went for the most straightforward brute-force approach. Hopefully, it actually works for all cases.
java.lang.String items[]=new String[50];//0 indexed
int arraylength=items.length-1;
boolean found=false;
int lowbound=0;
int midpoint=-1;
int highbound=arraylength;
int foundIndex=-1;
for(int i=0;i<50;i++)
{
items[i]=String.valueOf(i);
}
Arrays.sort(items);
String item2find=new String("30");
int passes=0;
while(!found)
{
//first, see if object can exist in range
passes++;
if ((item2find.compareTo(items[lowbound])<0)||(item2find.compareTo(items[highbound])>0))
{
found=true;
foundIndex=-1;
continue;
}
midpoint=lowbound+((highbound-lowbound)/2);
if(items[midpoint].equals(item2find))
{
found=true;
foundIndex=midpoint;
continue;
}
if(highbound==lowbound)
{
found=true;
foundIndex=-1;
continue;
}
if(item2find.compareTo(items[midpoint])<0)
{
highbound=midpoint-1;
}
else
{
lowbound=midpoint+1;
}
}
System.out.println(passes+" passes. Location:"+foundIndex);
In terms of "Worked on first try"… well, sortakinda. :) I needed to add arrays.sort. Once the array was, in fact, sorted, the algorithm SEEMS to work. If I put in an element that's not there, it returns -1, if I put in an element that is there, it finds it. I'm sure I'm missing something, I always do….
gistfile1.pyw
lo and behold, I run the test for the first time and… voila! here’s a bug!
I must admit it was a pure WTF moment, but the error turned out to be a typo — the program still parsed and ran, it just… well, didn’t work. ;)
Ah, screw it.
long *binsearch( long *ary, long sz, long item)
{
if( 0 == sz ) return NULL; /* 0-length array */
if( ary[ sz>>1 ] == item ) return &ary[ sz>>1 ];
if( ary[ sz>>1 ] < item ) return binsearch( &ary[ 1 + sz>>1 ], sz>>1 + sz&1 - 1, item );
return binsearch( ary, sz>>1, item );
}
def bsearch(target,array,offset=0):
l = len(array)
if l == 0:
return False
elif l == 1:
if array[0] == target:
return offset
else:
return False
else:
midpos = l//2
midval = array[midpos]
if midval == target:
return offset + midpos
elif midval < target:
return bsearch(target,array[midpos:],offset+midpos)
else:
return bsearch(target,array[:midpos],offset)
Again with formatting
Bah, markup ate my attempt. Here’s how it’s supposed to read: http://pastebay.com/94342
@mike, your instructions on how to circumvent wordpress’ markup got marked up itself, so it can’t be read…. :)
#include <assert.h>
#include <stdio.h>
int binary_search_helper(int *array, int search_from, int search_to, int needle)
{
    int subset_length = search_to - search_from + 1;
    int midpoint = search_from + subset_length / 2;
    if (search_from == search_to)
    {
        if (needle == array[search_from]) return search_from;
        return -1;
    }
    if (needle == array[midpoint]) return midpoint;
    if (needle < array[midpoint] && midpoint != search_from) return binary_search_helper(array, search_from, midpoint - 1, needle);
    else if (needle > array[midpoint] && midpoint != search_to) return binary_search_helper(array, midpoint + 1, search_to, needle);
    else return -1;
}
int binary_search(int *array, int length, int needle)
{
    return binary_search_helper(array, 0, length - 1, needle);
}
int main()
{
int array_odd[] = {1, 3, 4, 7, 9, 11, 102};
int array_even[] = {1, 3, 4, 7, 8, 9, 11, 102};
int index;
for (index = 0; index < 7; index++)
{
assert(binary_search(array_odd, 7, array_odd[index]) == index);
}
assert(binary_search(array_odd, 7, 1337) == -1);
assert(binary_search(array_odd, 7, -5) == -1);
for (index = 0; index < 8; index++)
{
assert(binary_search(array_even, 8, array_even[index]) == index);
}
assert(binary_search(array_even, 8, 1337) == -1);
assert(binary_search(array_even, 8, -5) == -1);
printf("All tests pass.\n");
return 0;
}
Output:
mooneer@voldemort:~$ ./bsearch
All tests pass.
mooneer@voldemort:~$
After 15 minutes: http://gist.github.com/371501
I wouldn’t be quite this defensive if I didn’t know of the problem’s reputation.
I haven’t bothered with local variables or default arguments.
So, my results? Well, I failed. I got the logic right, but got bitten by an arcane syntactical rule… because I chose a language I could test easily, but in which I am not quite fluent.
In fairness, though, it’s five years since I coded for a living.
My first crack at bsearch in Perl, and it seems to work. Compares strings only, for brevity.
# bsearch($elt, $arrayref) performs a binary search on an array of
# sorted strings. returns element index if $elt is in @$arrayref,
# undef otherwise.
sub bsearch {
my ($elt, $arrayref, $min_idx, $max_idx) = @_;
my $nelts = scalar @$arrayref;
return undef unless $nelts > 0;
return undef if $min_idx > $max_idx;
$min_idx = 0 unless defined $min_idx;
$max_idx = $nelts-1 unless defined $max_idx;
my $mid_idx = int(($max_idx - $min_idx) / 2) + $min_idx;
if ($elt eq $arrayref->[$mid_idx]) {
return $mid_idx;
}
elsif ($elt lt $arrayref->[$mid_idx]) {
return bsearch($elt, $arrayref, $min_idx, $mid_idx-1);
}
else {
return bsearch($elt, $arrayref, $mid_idx+1, $max_idx);
}
}
Gah, it stripped out the pre tag. http://pastebin.ca/1868406
Output is still “All tests pass.” :)
Unfortunately, I failed because I accidentally swapped a < for a <=.
First try:
After short testing:
public static int search(int [] arr, int key){
if((arr.length == 0)||(arr ==null)){
return -1;
}
if(arr.length ==1){
if(arr[0] == key){
return 0;
}else{
return -1;
}
}
int left =0;
int right = arr.length-1;
int mid = (left+ right)/2;
boolean found = false;
while((left key){
left =mid+1;
mid = ((left+right)/2);
}else{
right = mid-1;
mid = ((left+right)/2);
}
}
if(found == false){
return -1;
}
return mid;
}
untested, php, w/ tail-recursion
recursion makes things like this easy
$mB) return 1;
return -1;
}
function binSearch($aData, $mSearchVal, $iStart = -1, $iEnd = -1) {
if ($iStart == -1) { $iStart = 0; $iEnd = count($aData) -1; }
switch($iEnd – $iStart) {
case 1:
if (0 == compare($aData[$iEnd], $mSearchVal)) return $iEnd;
case 0:
if (0 == compare($aData[$iStart], $mSearchVal)) return $iStart;
return false;
default:
$iMidP = ($iStart + $iEnd) / 2;
if (0 == compare($aData[$iMidP], $mSearchVal)) return $iMidP;
if (0
in python (no testing done):
def search(n, l):
    H = len(l) - 1
    L = 0
    M = int(H / 2)
    while H - L > 0 and n != l[M]:
        if n > l[M]:
            L = M
        else:
            H = M
        M = int((H + L) / 2)
    if n == l[M]:
        return M
    return -1
Just a simple, recursive Ruby solution. Hopefully correct too :-)
C, 10 minutes, no testing, fingers crossed…
[
int binary_search(int target,int* list,int start,int end) {
if(start==end)return -1;
if(list[start]==target)return start;
int mid=start+(end-start)/2;
if(list[mid]<target)return binary_search(target,list,mid+1,end);
else return binary_search(target,list,start,mid+1);
}
int bsearch(int target,int* list,int len) {
return binary_search(target,list,0,len);
}
]
Wrote it first, tested it afterwards, posting finally. As far as I can test it, no bugs found, and recognizes unavailable elements correctly. *phew*. Wrote and checked the code in 20 minutes or so. I Used Java List instead of arrays, since it’s easier to sort in my tests. But there’s no magical properties…
Did it recursively. Returns the index in which the element is, or throws a suitable exception if not found.
public static <T extends Comparable<T>> int binarySearch(final T needle,
        final List<T> haystack) throws ElementNotFoundException {
    return binarySearch(needle, haystack, 0, haystack.size() - 1);
}
private static <T extends Comparable<T>> int binarySearch(final T needle,
        final List<T> haystack, final int left, final int right)
        throws ElementNotFoundException {
    if (left > right) {
        throw new ElementNotFoundException();
    }
    final int center = ((right - left) / 2) + left;
    final T elementAtCenter = haystack.get(center);
    final int comparisonResult = elementAtCenter.compareTo(needle);
    if (comparisonResult < 0) {
        return binarySearch(needle, haystack, center + 1, right);
    } else if (comparisonResult > 0) {
        return binarySearch(needle, haystack, left, center - 1);
    } else {
        return center;
    }
}
Haven’t tested it at all:
I screwed it up. I tried to anticipate Python and integer-division rules (when ignoring them does the right thing instead), and so got infinite recursion.
Two minutes longer than it should have taken me. Blargh.
Ngh… formatting fail. Sorry about that. I didn’t know how to format the code properly :(
No testing done, except for making it compile right. Bash away :)
http://www.mistcat.com/binary_search.html
I can’t figure out your crazy wordpress code insertion magic…
php, w/ tail-recursion, now tested, worked first time
recursion makes things like this easy
previous html formatting fail
<?php
function compare($mA, $mB) {
//modify as necessary comparing
if ($mA == $mB) return 0;
if ($mA > $mB) return 1;
return -1;
}
function binSearch($aData, $mSearchVal, $iStart = -1, $iEnd = -1) {
if ($iStart == -1) { $iStart = 0; $iEnd = count($aData) -1; }
switch($iEnd - $iStart) {
case 1:
if (0 == compare($aData[$iEnd], $mSearchVal)) return $iEnd;
case 0:
if (0 == compare($aData[$iStart], $mSearchVal)) return $iStart;
return false;
default:
$iMidP = ($iStart + $iEnd) / 2;
if (0 == compare($aData[$iMidP], $mSearchVal)) return $iMidP;
if (0 < compare($aData[$iMidP], $mSearchVal))
return binSearch($aData, $mSearchVal, $iStart, $iMidP - 1);
return binSearch($aData, $mSearchVal, $iMidP + 1, $iEnd);
}
}
var_dump(binSearch(array(0, 5, 22, 412, 1234, 2134, 5432), 5));
?>
time taken: 8 minutes in ruby
bugs: at least 2 illuminated from comments above.
i guess i haven’t written a binary search since i was in algorithms class 15 years ago. no recollection as to whether that one turned out to be correct.
at least i failed quickly?
If I wrote this right, it has the added advantage of always returning the smallest correct index (so a search for 3 in [2, 3, 3, 3, 3, 3, 3, 4] will return 1).
int binsearch(int *arr, int len, int search) {
    int *mid, at, left_len, right_len, val;
    if (len <= 1) {
        if (len == 1 && *arr == search) return 0;
        return -1;
    }
    left_len = len/2;
    right_len = len - left_len;
    mid = arr + left_len;
    val = *mid;
    if (search <= val) {
        at = binsearch(arr, left_len, search);
        if (at < 0 && search == val) at = left_len;
    } else {
        at = binsearch(mid, right_len, search);
        if (at >= 0) at += left_len;
    }
    return at;
}
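The "smallest correct index" property claimed above can also be had from an iterative search. What follows is only a minimal sketch in Python, not the commenter's code; the name `leftmost_index` is made up for illustration. The trick is never to return early on a match: the range keeps shrinking until it pins down the first position whose element is not less than the target.

```python
def leftmost_index(arr, target):
    """Return the smallest index i with arr[i] == target, or -1 if absent."""
    low, high = 0, len(arr)              # search the half-open range [low, high)
    while low < high:
        mid = low + (high - low) // 2
        if arr[mid] < target:
            low = mid + 1                # everything up to and including mid is too small
        else:
            high = mid                   # arr[mid] >= target, so mid stays a candidate
    # low is now the first position whose element is >= target (or len(arr))
    if low < len(arr) and arr[low] == target:
        return low
    return -1


print(leftmost_index([2, 3, 3, 3, 3, 3, 3, 4], 3))  # 1
```

Because the loop always either raises `low` past `mid` or lowers `high` to `mid`, the range shrinks on every pass, including for the empty array.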
Python. I did test it, but I swear that I did not alter it after testing.
Python, iterative, passed all my tests so far (I’m probably missing something, though):
def bsearch(nums, item):
    while nums:
        mid = len(nums) // 2
        if nums[mid] > item:
            nums = nums[:mid]
        elif nums[mid] < item:
            nums = nums[mid+1:]
        else:
            return True
    return False
I predict that more than 10% of programmers who read programming blogs will get this right.
def binary_search(array,val,lowval=0,highval=array.size-1)
searchval = lowval + (highval-lowval)/2
if val > array[searchval]
lowval = searchval
binary_search(array,val,lowval,highval)
elsif val < array[searchval]
highval = searchval
binary_search(array,val,lowval,highval)
else
puts searchval
end
end
Untested:
Emacs Lisp. No bugs found during testing.
(defun bsearch (vec elt)
  "binary search VEC for ELT, returning index, or -1 if not found"
(bsearch-impl vec elt 0 (- (length vec) 1)))
(defun bsearch-impl (vec target left right)
(let ((mid (/ (+ left right) 2)))
(cond ((eq left right)
(if (eq target (elt vec left))
left
-1))
(( (elt vec mid)
(bsearch-impl vec target (+ 1 mid) right)))))
Silly HTML eating my formatting . . . Still a success tho’.
Another one with Javascript, recursive.
After writing it, I did some rough testing and seems OK (but I may be wrong).
The surprising part is that I’ve chosen Javascript. I need to think about it.
Tried to be too clever,
doesn’t go into an infinite loop…
[
def bsearch(sorted_array, given)
  midpoint = sorted_array.length / 2
  check = sorted_array[midpoint]
  return check if given == check
  return nil if sorted_array.length <= 1
  unless given < check
    bsearch sorted_array[(midpoint + 1)...sorted_array.length], given
  else
    bsearch sorted_array[0...midpoint], given
  end
end
]
2nd attempt, possibly with indentation this time. Python, iterative, passed all my tests so far.
I predict that more than 10% of programmers who read programming blogs will get this right. Apart from the formatting in comments.
I think it ate part of my post.
Here’s an updated post now that I can read how to mark up the code s.t. wordpress is happy.
Time taken: 8 minutes
Result: failed. My original attempt had “end = mid - 1” instead of “end = mid”. That’s the only change I made from the original version.
Total testing/fixing time: 5 minutes
So 13 minutes total. After the off-by-one I mentioned above was fixed, it passed my tests and the other regression tests other people have posted in C… but still, my first version was a failure, so I fail the test. :)
/**
* Returns index if found, -1 otherwise.
*/
int binarySearch(int[] array, int searched) {
int start = 0, end = array.length;
while (start searched) begin = middle + 1;
else end = middle - 1;
}
return -1;
}
Took me about half an hour. I’m in second year computer science and was introduced to the algorithm last year, though I never wrote one.
For fun, I wrote mine in Delphi:
looked again, forgot to enforce typing
change line 22:
$iMidP = ($iStart + $iEnd) / 2;
to:
$iMidP = (int)(($iStart + $iEnd) / 2);
def binsearch(v, lst):
    """Find a value v in a sorted list lst, using a binary search algorithm.
    """
    length = len(lst)
    if length == 0:
        return False
    elif length == 1:
        if lst[0] == v:
            return True
        else:
            return False
    else:
        if not length%2:
            tmp = length//2 + 1
        else:
            tmp = length//2
        tmp_v = lst[tmp]
        if tmp_v == v:
            return True
        if tmp_v < v:
            return binsearch(v, lst[tmp:])
        else:
            return binsearch(v, lst[:tmp])
This was my original version, and I've found 1 bug, my +1 is on the wrong branch.
The no testing clause was a killer on this, because I'd normally do something like this with unit tests via TDD. So, I did not get a working version in my first cut, but would have if I had written it as I normally write code.
I had an off-by-one error for the new upper/lower bound, but other than that, I got it.
(ok the html code tag was a bad idea apparently)
/**
* Returns index if found, -1 otherwise.
*/
int binarySearch(int[] array, int searched) {
int start = 0, end = array.length;
while (start searched) begin = middle + 1;
else end = middle - 1;
}
return -1;
}
Got my loop comparison wrong. I wrote
while I should have written
Here’s mine, untested:
This is not the way I wrote it when assigned this 10 years ago in college (I’m pretty sure I had the non-recursive loop with 3 comparisons), but it is the way we were shown to do it after turning it in.
I’ve never had to do anything like this in my career to date, for whatever reason (probably because I’m a java developer working on web based apps).
Fail.
Coded it in 10 minutes, not thinking enough, and forgot the +1/-1. Everything else is almost exactly like in the Google example. I incidentally even took care of index overflows :)
Well, I’m clearly not ready to join the ranks of the great and good just yet… I got my “greater than” and “less than” the wrong way around. That amended, here it is, tested, as a Smalltalk method definition:
binarySearchFor: target in: array range: interval
| midIndex midValue |
interval size = 1 ifTrue: [
(array at: interval first) = target
ifTrue: [^interval first]
ifFalse: [^nil]
].
midIndex := interval first + (interval size // 2).
midValue := array at: midIndex.
midValue = target ifTrue: [
^midIndex
].
midValue > target ifTrue: [
^self binarySearchFor: target in: array range: (
interval first to: (midIndex - 1 max: interval first)
)
].
midValue < target ifTrue: [
^self binarySearchFor: target in: array range: (
(midIndex + 1 min: interval last) to: interval last
)
].
Not tested.
…
def search(list, val, start=0, end=None):
    if not end: end = len(list)-1
    if start>end:
        return -1
    middle = (end+start) // 2
    if list[middle] == val:
        return middle
    if list[middle] < val:
        return search(list, val, start=middle+1, end=end)
    if list[middle] > val:
        return search(list, val, start=start, end=middle-1)
…
I did manual testing by mentally stepping through my pseudocode with a few test inputs, which identified a couple of bugs that I fixed. Hope that is not cheating! After I decided that pseudocode was correct, here it is translated into Python:
Not tested:
First attempt. C#. Worked in every test I could think of:
public static int? BinSearch(IEnumerable<int> src, int target)
{
    if (!src.Any())
        return null;
    var n = (int)Math.Floor(src.Count() / 2d);
    var mid = src.ElementAt(n);
    if (mid == target)
        return mid;
    if (mid < target)
        return BinSearch(src.Skip(n+1), target);
    return BinSearch(src.Take(n), target);
}
You didn’t seem to provide an email to contact you with, so I’ll comment here.
I wrote my response in Python, and there did turn out to be two bugs. One, I mixed up the label for the length of the incoming list and the label for the point to be searched, and two, I forgot to divide the length in half when I recursed. Ah well. Got it right in about ten minutes though, including the testing.
def bsearch(srtd,x):
l = len(srtd)
if l == 0:
return False
med = srtd[l/2]
print med
if med == x:
return True
if x < med:
return bsearch(srtd[:(l/2)],x)
else:
return bsearch(srtd[(l/2)+1:],x)
I started writing as soon as I saw “Try It”, so I tested it twice before really thinking “Okay it’s done”.
def find(value, ary)
subarray_find(value, ary, 0, ary.length-1)
end
def subarray_find(value, ary, bottom, top)
if top < bottom
return -1
end
pivot_loc = ((top+bottom)/2.0).floor
pivot_value = ary[pivot_loc]
if pivot_value == value
return pivot_loc
elsif pivot_value < value
return subarray_find(value, ary, pivot_loc+1, top)
elsif pivot_value > value
return subarray_find(value, ary, bottom, pivot_loc-1)
end
end
presorted_array = [0, 1, 2, 3, 4, 5, 6, 9, 11, 25]
presorted_array.each do |v|
puts find(v, presorted_array) #should result in printing 0-9
end
puts find(17, presorted_array) #-1
puts find(17, []) #-1
Here’s my impl in Clojure. Not tested. Takes a custom search function. Is susceptible to numerical overflow, but if you have that many elements in an in-memory data structure, you have other problems.
I’ll test it later on today if I get time. Is there a standard set of test conditions, anywhere? I’d hate to “pass” simply because I forgot an edge case in my test.
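There is no standard set mentioned in this thread, but the edge cases that come up repeatedly in the comments (empty array, one element, target below, above and between the elements) can be swept mechanically. The sketch below is one way to do that in Python, under the assumption that the function under test takes `(arr, target)` and returns an index or -1. Both `binary_search` and `check` are hypothetical names introduced here, and `binary_search` is just a stand-in candidate.

```python
def binary_search(arr, target):
    """Stand-in candidate: return an index of target in sorted arr, or -1."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if arr[mid] < target:
            low = mid + 1
        elif arr[mid] > target:
            high = mid - 1
        else:
            return mid
    return -1


def check(search, max_len=8):
    """Compare search() with a linear scan on every array [0, 2, 4, ...] up to max_len."""
    for n in range(max_len + 1):              # n == 0 is the empty-array case
        arr = [2 * i for i in range(n)]       # even numbers, so odd targets fall in the gaps
        for target in range(-1, 2 * n + 1):   # below, between, equal to and above the elements
            expected = arr.index(target) if target in arr else -1
            got = search(arr, target)
            assert got == expected, (arr, target, got, expected)
    return True


print(check(binary_search))  # True
```

The arrays contain no duplicates, so this says nothing about which index comes back when the target occurs more than once; that would need its own cases.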
Here’s mine. I THINK it’s right, but hell if I know. No overflow bug. Using python.
# This MAY work in python 2.6, not sure. Tested on 3.1
########
# Binary searchy! Returns the index of the found element.
def bsearch(data, toFind):
begin = 0
end = len(data) - 1
while begin < (end - 1):
pivot = int(begin + (end - begin) / 2)
if data[pivot] == toFind:
return pivot
elif data[pivot] < toFind:
begin = pivot
elif data[pivot] > toFind:
end = pivot
if data[begin] == toFind:
return begin
elif data[end] == toFind:
return end
return -1
In C, it took me 1 hour with all the cosmetics (randomly initializing an array of user-provided size and sorting it…). I think the core function took me about 20 minutes. Damn, so much longer than I thought.
It seems to work at first glance. Here’s the core search function:
int binary_search(int *array, int begin, int end, int searchvalue)
{
int idx = 0;
if (end-begin == 0) return -1;
if (end-begin == 1) {
if (array[begin] == searchvalue) return begin;
else return -1;
}
idx = begin+(end-begin)/2;
if (array[idx] == searchvalue) return idx;
if (searchvalue < array[idx]) return binary_search(array, begin, idx, searchvalue);
else return binary_search(array, idx, end, searchvalue);
}
First try failed when the target was outside the range of the array. Second try was successful.
Success!
Eh, screwed up my formatting. Here’s my successful binary search as a GitHub gist:
I wrote this recursive version:
It took about 10 minutes. I expected it to be correct. I then ran this test:
And got this result:
Wept a little, and corrected the 2 (two!) bugs, which were extremely obvious after the fact, namely that the greater-than should be less-than and that the index returned is wrong because I forgot to add the pivot when splitting. The end result is this:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad
binsearch = ((head . filter ((1 ==) . length)) .) . iterate . listHalf
where
listHalf = join . (`ap` halfway) . dropTake
dropTake needle haystack = if ((haystack !! halfway haystack) <= needle) then drop else take
halfway = flip div 2 .length
Time to flex my ruby:
I hope the formatting didn’t fail. Wrote once, tested (worked) then reformatted into the above for fun.
It screwed with my formatting too. Here’s a fixed one:
http://gist.github.com/371583
Untested…..
Wrote this and thought through some test cases on paper. Took about 40 min, over which time I also pared it down from about 2x as long. Tested using arrays of 0,1,2,3 elements with search items hitting each as well as missing below and between each.
[pre]
def binarySearch(A, t):
a, b = 0, len(A)-1
while b >= a:
mp = a + (b-a)/2
if A[mp] == t:
return t
elif A[mp] < t:
a = mp+1
else:
b = mp-1
return None
[/pre]
Woohoo! I did it!
Of course, that assumes I implemented the testing algorithm correctly as well…
I failed
Below is the first version I came up with that I think will work. I’m going to try my luck and not test it :) I know this could be simplified further, but for science I’ll leave it like this.
# vim:tabstop=8:shiftwidth=4:smarttab:expandtab:softtabstop=4:autoindent:
# Python 2.5.4
# Non-obvious ends of blocks have been indicated with comments, in
# case the indents get lost when posting as a comment on the blog.
# This code has only been syntax checked.
class BinarySearch:
def __init__(self):
''' Initialize '''
def search(self, num, list):
list_length = len(list)
lower_bound = 0
upper_bound = list_length - 1
found_position = -1
done = False
while not done:
if upper_bound == lower_bound:
done = True
if num == list[upper_bound]:
found_position = upper_bound
#fi
else:
middle_point = lower_bound + int((upper_bound + 1 - lower_bound) / 2)
if num == list[middle_point]:
found_position = middle_point
done = True
elif num > list[middle_point]:
lower_bound = middle_point
if lower_bound lower_bound:
upper_bound -= 1
#fi
#fi
#fi
#elihw
return found_position
#fed
if __name__ == '__main__':
bs = BinarySearch()
for list_length in range(1, 10):
list = range(0, list_length)
for num in list:
pass
#print num, list, bs.search(num, list)
#rof
#rof
Hell, if we’re posting without testing, I might as well go big.
works (as far as i can tell with some quick testing)
In Haskell
Next, describe an algorithm to implement an insertion sort where you insert data into the array, in the correct position so that the array doesn’t need to be sorted before use. Finally, alter that algorithm for dealing with inserting already sorted data into the array. As someone once said, been there, done that… :-)
In any case, these algorithms appear simple on the face of it, but implementing them correctly can take a considerable amount of time and doh!
BTW, you only mentioned not cheating by use of bsearch() – what about qsort()? :-)
Never mind! This is a search, not a sorting problem… (Sound of Homer Simpson slapping his forehead) Doh!
Failed before testing:
def search(array, value, bottom = 0, top = nil)
  top ||= (array.length - 1)
  middle = (top + bottom) / 2
  if array[middle] == value
    return middle
  elsif top == bottom
    return nil
  elsif value < array[middle]
    return search(array, value, bottom, middle)
  else
    return search(array, value, middle, top)
  end
end
After testing:
def search(array, value, bottom = 0, top = nil)
  top ||= (array.length - 1)
  middle = (top + bottom) / 2
  if array[middle] == value
    return middle
  elsif top <= bottom
    return nil
  elsif value < array[middle]
    return search(array, value, bottom, middle - 1)
  else
    return search(array, value, middle + 1, top)
  end
end
Failed. Wrote a recursive implementation that returned the offset:
…but didn’t track the offset of splits, so it failed as soon as it recursed.
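The failure described here, a recursive search on sub-arrays that returns a position relative to the current slice, is usually fixed by carrying the slice's starting position along. The original code is not shown, so this is a hypothetical Python sketch of the idea; `search_slices` and its `offset` parameter are names invented for illustration.

```python
def search_slices(arr, target, offset=0):
    """Recursive search on slices; offset is where arr[0] sits in the original list."""
    if not arr:
        return -1
    mid = len(arr) // 2
    if arr[mid] == target:
        return offset + mid              # translate back to an index in the original list
    if arr[mid] < target:
        # The right-hand slice starts at mid + 1, so the offset grows by that much.
        return search_slices(arr[mid + 1:], target, offset + mid + 1)
    # The left-hand slice starts where arr does, so the offset is unchanged.
    return search_slices(arr[:mid], target, offset)


print(search_slices([0, 1, 2, 3, 4, 5, 6, 9, 11, 25], 11))  # 8
```

Slicing copies the elements, so this version does linear work overall; passing a low and high index instead of a slice avoids both the copying and the offset bookkeeping.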
Seems to work correctly on the first try. Having looked through the comments, it appears my code is virtually identical to the stuff by Lawrence Kesteloot, except in a different language. Crazy.
So with no testing I was 95% there…i think.
But who writes anything without testing to find their errors? It’s an unreasonable expectation to be able to write anything more than the most trivial program without having errors prior to testing.
Here’s mine. I don’t know if this is good Lua style because I don’t know Lua very well.
It could be complained that perhaps the routine ought to return the index at which the value would need to be inserted, rather than nil, on not-found. I didn’t think of that until after I tested it.
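For comparison, here is how the "return the insertion point on not-found" idea might look in Python, using the standard library's `bisect_left`. This is only a sketch: the function name and the `(found, index)` return convention are assumptions made for illustration, not anything from the Lua routine discussed above.

```python
from bisect import bisect_left


def find_or_insertion_point(arr, target):
    """Return (True, index) if target is in sorted arr, else (False, insertion index)."""
    i = bisect_left(arr, target)         # first position where target could go, keeping arr sorted
    found = i < len(arr) and arr[i] == target
    return found, i


print(find_or_insertion_point([1, 3, 5], 3))  # (True, 1)
print(find_or_insertion_point([1, 3, 5], 4))  # (False, 2)
print(find_or_insertion_point([], 7))         # (False, 0)
```

Returning a pair keeps "found at 0" and "would be inserted at 0" distinct, which a bare index cannot do.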
It’s not my first time, but it’s been a few years.
After writing this I looked at some of the comments. One that I thought was particularly interesting was @Juanjo’s, which I think will infinite-loop if you look for, say, 3 in the one-element array [2]. Haven’t tested that, though.
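The comment in question is not reproduced here, so the following is not that code. It is a hypothetical Python sketch of how this kind of infinite loop typically arises: if a bound is set to `mid` itself rather than moved past it, a one-element range can never shrink. The `shrink` flag and the `max_steps` guard are made up for the demonstration, so that the stall shows up as an error instead of a hang.

```python
def search_steps(arr, target, shrink=True, max_steps=100):
    """Return (index or -1, loop iterations). shrink=False reproduces the stalling update."""
    low, high = 0, len(arr) - 1
    steps = 0
    while low <= high:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("range stopped shrinking")
        mid = low + (high - low) // 2
        if arr[mid] == target:
            return mid, steps
        if arr[mid] < target:
            low = mid + 1 if shrink else mid     # low = mid makes no progress when low == mid
        else:
            high = mid - 1 if shrink else mid    # likewise for high = mid
    return -1, steps


print(search_steps([2], 3))                      # (-1, 1)
try:
    search_steps([2], 3, shrink=False)
except RuntimeError as err:
    print(err)                                   # range stopped shrinking
```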
In python, as yet untested:
{source}
import math

def bsearch(search_value, input_array):
    """
    Assume array is sorted, min at [0] and max at [-1]
    """
    # Ending conditions
    if len(input_array) <= 0:
        return False
    if search_value < input_array[0] or search_value > input_array[-1]:
        return False
    split_index = int(math.floor(len(input_array) / 2))
    if input_array[split_index] == search_value:
        return True
    if search_value > input_array[split_index]:
        return bsearch(search_value, input_array[split_index+1:])
    else:
        return bsearch(search_value, input_array[0:split_index])
{/source}
Off to write unit tests and see….
Ooops… RTFM fail.
Tried it in Python and failed because of a typo. Once I corrected the typo, the rest was correct.
far from perfect. 15 minutes. doesn’t work when item is not in array (stack overflow)
Yup when in doubt I reach for my Perl hammer. It’s sad really. It’s worked for all the tests I’ve run but I’m sure I’m missing something.
Ruby solution – Technically a fail due to if/else syntax error (ruby’s still a new language), but my algorithm implementation was sound…
Returns the array index of the element, if found, or -1 if not. (This seemed the only sensible return value to me… why return a number you already know?)
Here’s my attempt in Clojure. I wrote it and subsequently tested it. I think it’s working. I may be wrong.
Here’s a recursive pseudocode. Looks like I almost wrote in Python.
bsearch (T element, lst) returns index of element if in sorted list lst, otherwise throws. Assumes element has operator
if element is null: throw NullElement // assuming comparison operator can’t be defined for null
if lst is null: throw NullList
int len = sorted_list.length
if len == 0: throw EmptyList
int n = len / 2
T curr = sorted_list[n]
// need to bottom out!
if curr == element:
return n
else if len == 1:
throw ElementNotFound
else if len == 2 && curr < element:
throw ElementNotFound
else if curr element: // search down
return bsearch(element, lst[(0)..(n-1)])
Steve Witham: yes, the <pre> tag does sort of work in WordPress comments, but it won’t do all the things you’d want it to do; in particular, it throws away indentation, which is pretty critical in code samples (especially if you write Python). Instead, use {source}…{/source}, but use square brackets instead of curlies.
public static int binarySearch(int[] array, int begin, int end, int value) {
    if (begin < 0 || end > array.length - 1) throw new ArrayIndexOutOfBoundsException();
    int high = end;
    int low = begin;
    int mid = low + ((high - low) / 2);
    if (value == array[mid]) return mid;
    if (value > array[mid]) return binarySearch(array, mid + 1, high, value);
    if (value < array[mid]) return binarySearch(array, low, mid - 1, value);
    return -1;
}
public static int binarySearch(int[] array, int value) {
    return binarySearch(array, 0, array.length - 1, value);
}
Nope, I was wrong.
Here’s my first pass in ruby I think it works
I failed.
Wrote it in Ruby, ran it. Buggy.
Here’s the version I ended up getting working.
http://pastebin.com/rAwy1WcJ
Update after testing and seeing the blog update about source formatting.
Argh, looks like my paste failed to get my search up case. Not sure why.
Josh Bloch has already discussed Binary Search in connection with JDK and Programming Pearls in a great detail here http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
[Mike says: yes, Nabeel. You’d have thought I’d have linked to that article, wouldn’t you? Oh! Wait! …]
Doh! Of course I missed the point about returning the index of item in the array (which the description from the book doesn’t really point out). After a bit of thought here was my updated version to do that.
Third and final try, lt/gt seems to be the problem. Incidentally “preview comment” would have been nice.
Yay, passed with this C:
Iterative python, seems like a lot of people did the same. Once I remembered how the range of an array is specified, think it is OK. Caught the index out of bounds condition that others seemed to miss. Wouldn’t take a bet that something else is missing though!
def bshalf(array, N):
    if len(array)==1:
        if N==array[0]:
            print 'found', N
            return 1
        else:
            print 'not in array, in bounds'
            return 0
    halflenarr=int(len(array)/2)
    print halflenarr, len(array), array
    if N>array[halflenarr]:
        return bshalf(array[halflenarr+2: len(array)], N)
    else:
        return bshalf(array[0: halflenarr+2], N)
def bs(array, N):
    lenarr=len(array)
    if N>array[lenarr-1]:
        print 'outside initial array'
        return 0
    return bshalf(array, N)
Javascript, iterative, seems to work.
I did right before testing it, but it was a bit harder than I thought it should be and wasn’t the most elegant code.
Forgot a “:” on else, otherwise right.
Cool Post!
Almost correct in python on first try…. It would return the correct index, and would handle out-of-bounds conditions, but with a search value in the range of but not in the list I hit a loop. I fixed it on the 2nd try. You can decide whether to count it as a success or fail.
// c# implementation of binary search
public int BinarySearch( int[] array, int index)
{
int start = 0;
int end = array.Length;
while(start < end)
{
int mid = (start + end)/2;
if(index < array[mid])
{
end = mid;
}
else start = mid + 1;
}
return start;
// you could also return end because theoretically when the loop ends start and end will be the same spot in the array :)
}
Code and unittests (all pass!) at http://pastebin.com/Rjr6wx1b
ruby-esque pseudocode, untested:
bsearch(arr,val){
if(arr.length==0)
return nil;
else if(arr.length==1 and val!=arr[0])
return nil;
n = (arr.length-1)/2
if(arr[n]==val)
return n
else if(arr[n]>val)
return bsearch(arr[0:n-1],val)
else
return bsearch(arr[n+1:arr.length-1],val)
}
Success!!! took me 40 minutes.
Whipped this up in Obj-C. Haven’t even compiled it, much less tested it. But I believe it should work (shouldn’t even have the integer overflow bug documented in that link, and yes I wrote the code before following the link).
Updated (and simpler) test section for my implementation, taking into account the need (mentioned in other comments) to test for numbers outside the range of values in the list and to test for numbers that are in the range, but not in the list:
i posted my solution here: http://conigliaro.org/2010/04/19/am-i-one-of-the-10-of-programmers-who-can-write-a-binary-search/
be gentle. i haven’t been a full time programmer in quite a few years. ;-)
I gave this a go, and did some preliminary testing. It appears to function properly. Had to brush back up on my c skillz though.
Sorry, let me try that again:
I was confident in my code until I was about to write the first testcase and realized that I’d forgotten to test for an array size of zero, but otherwise bugfree. Unlike most of the code posted here, I tested the target value against the first and last entry before doing a divide and conquer loop, since the definition states that the range being tested is the range in which the value must lie.
it’s ruby, not tested. not elegant, but it “should” work.
Failed and took forever – now I hate my life — even more.
Here’s my go. Took me about 20 minutes. I guess I technically failed as it didn’t work on my first test, but it was really just a typo (I had first/last reverse on line 13). After fixing that, it seems to work fine:
It would be helpful if you had a test set of searches that could be used to verify correctness.
I know it’s right, I don’t *need* to test it.
Found a bug after adding the following two test conditions:
Fails on an empty list:
Commented out the first of the two new tests and the second new test also fails:
Works for all my test cases.
Don’t flame me for using java.
LOL @ Emo
… who I assume is joking …
:-)
Brad and others have expressed a wish that I’d provided a set of testcases. Unfortunately, to be useful such a set would need to be made available in a language usable by the code being tested, and as at least a dozen different languages have been used that’s not really feasible. Maybe I should have picked a lowest-common-denominator language like Java and mandated that … but I am glad I didn’t. It’s done my heart good to see the range of languages used here.
Untested
I fail, after I submitted I realized I didnt account for the empty array case. The rest was correct
Added two lines to the top of the search method:
All of my tests pass now.
Tested, but no corrections seemed necessary. Of course, I iterated through the three examples at the bottom of my code by hand. Not my prettiest code, but it handles the major cases. Took me about 35 minutes, including testing. The code was written in about 20 minutes with two bug fixes discovered during hand iteration (needed to add bot to mid if lowering the upper bound and mid to mid if raising the lower bound). I’ve not tried it with a larger data set.
C#
C#
@Mike – you could just provide test cases in pseudocode and leave it up to us to convert it to our language.
untested and unrun. *fingers crossed*
Success!
“Mike – you could just provide test cases in pseudocode and leave it up to us to convert it to our language.”
I guess. But in part, too, I deliberately held off because I know that some of the test cases will immediately show people aspects of the problem that I wanted to see whether they’d spot for themselves. One obvious example is the zero-length-array test case: as Kernighan and Pike say, “Make sure your code ‘does nothing’ gracefully”. And, sure enough, a few commenters have been brave and honest enough to admit that they overlooked that case.
There have been so many interesting comments here (and on Reddit and Hacker News) that I have lots of material for a followup article, probably to be posted tomorrow. That might include some abstract test-cases.
Ok so I did this one in ColdFusion and I was pretty lazy but used recursion. Since there is no “array split” type function in CF I also wrote one of those to make my life easier. I have not tested it beyond making sure CF doesn’t find a syntax error.
Hi, did not read to the end and tested before posting. After posting I noticed the out of bound issues, guess that makes me part of the majority.
def binary_search(a, r, T):
while r != [] and r[0] != T:
print r
if T < mean(r):
r = [r[0], mean(r)]
else:
r = [mean(r), r[1]]
return r[0]
This had a very silly off-by-one ((count v) instead of (dec (count v)) for the initial hi), so I failed. Seems to work now, though.
Success!!! took me 40 minutes. I didn’t want to cheat, so I read other comments after I made it, turns out you are supposed to use squares not curlies (whoops)
My submission:
Seems to work…though it’ll report incorrectly if end < start. I thought about that condition after I ran the test (but without checking for the case or failing the test) so I'm not sure if that counts as a failure or not…given the way the code works, it should be an impossible input anyways but I figured it should be complete in isolation, without relying on the usage in main to be right (can't do much about whether the array is ordered or the len is specified correctly though…gotta love pointers, eh?).
I see I also forgot to test the empty array case. So line 25 changes to
I *think* mine is correct.
Did it in about 20 minutes in haXe, followed all your rules to the letter.
Success, I think. SBCL’s compiler found a typo bug for me, but it worked the first time I tested it. I haven’t heavily tested it though. For large arrays (log(n) bigger than stack), an optimize declaration is required on bsearch to get most lisps to perform a TCO. I was aware of this before testing, and have only omitted it to make the code look cleaner.
The 90% error rate doesn’t surprise me at all. Where I work, we have a rule of thumb: “If it hasn’t been tested, it doesn’t work” It is right a lot more than 90% of the time.
If I got this right, it’s only because I’ve done it so many times before.
Oops, I forgot to write the number of tries:
4 tries / 20 minutes
untested:
have array x() with elements
function to return index i of array x() which matches x(i) = y, else return -1
zero based array
x() is sorted such that x(i) <= x(i+1) for valid indices i, i+1
function find(y)
n = x.length();
min = 0;
max = n - 1;
do while (min < max)
test = (min + max) / 2
if x(test) < y then
min = test + 1
else if x(test) > y then
max = test - 1
else
return test
end if
loop
return -1
end function
And another version in whitespace that worked right from the scratch:
”
“
PS: time to complete was 18 minutes, 21 seconds according to git.
Tada! My first ever python code. I tested it with 20000 randomly generated arrays, but I’m still not confident.
Code at (Annie // April 19, 2010 at 9:39 pm) doesn’t compile,
the line
T *middle = (first+last)/2;
should be
T *middle = first + (last-first)/2;
some of you need to read …
http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
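The midpoint form quoted above, `first + (last - first)/2`, is also the one that avoids the integer overflow discussed in the linked article when the bounds are array indices rather than pointers. Python's integers do not overflow, so the sketch below simulates signed 32-bit wraparound by hand. `to_int32`, `naive_mid` and `safe_mid` are hypothetical helper names, and the bounds are picked purely to trigger the wrap.

```python
def to_int32(x):
    """Wrap x the way a signed 32-bit two's-complement int would."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def naive_mid(low, high):
    """(low + high) / 2 as 32-bit arithmetic would compute it."""
    # Python's // floors while C and Java truncate; they agree below
    # because the wrapped sum happens to be even.
    return to_int32(low + high) // 2


def safe_mid(low, high):
    """low + (high - low) / 2: the difference is at most high, so nothing wraps."""
    return low + (high - low) // 2


low, high = 1_500_000_000, 2_000_000_000   # each fits in a signed 32-bit int
print(naive_mid(low, high))                # -397483648
print(safe_mid(low, high))                 # 1750000000
```

The naive form only goes wrong once `low + high` exceeds 2,147,483,647, which is why the bug can sit unnoticed until the array has more than about a billion elements.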
18 min in C++. Works with STL random iterators and with any type T which is LessThanComparable.
Mine works according to the included test. Very nearly messed up the midpoint calculation but caught it at the last minute before I ran the test.
def binary_search(a,target):
def search_range(begin,end):
if end-begin == 0:
return -1
if end-begin == 1:
if a[begin] == target:
return begin
else:
return -1
middle = (end+begin)/2
if target < a[middle]:
return search_range(begin,middle)
else:
return search_range(middle,end)
return search_range(0,len(a))
def test_search(a,target):
ix = binary_search(a,target)
if target in a:
assert ix == a.index(target), "failed for target == %d" % target
print target, ix
else:
assert ix == -1, "failed for target == %d" %target
print target, ix
a = range(0,100,3)
for t in range(-2,105):
test_search(a,t)
(self smack for not reading markup instructions properly, this is my second attempt at posting)
I prioritized legibility, went with recursion, had one bug that I fixed (I wasn’t checking array max index). Nice idea.
It did work the first time I ran it, much to my delight. However, I'm pretty sure that without the problem setup, designed to induce extreme paranoia, I would have failed. It's easy to see why 90% of the people in Bentley's test would have failed.
Yeah, infinite loop in some cases when the result’s not found. Three-second fix, but we’re going for first try, huh.
I had an off-by-one error when initializing the upper bound, outside of that it was a success.
I like the challenge, but what kind of a lunatic would write code before writing tests?
Haven’t tested it at all. Think I have all the edge cases. Needs golfing.
bah, close. i put ” on line 14. once i fixed that, everything else seems to work fine.
This was my untested code.
I got the calculation of index wrong, it should be index = lower + (upper - lower) / 2
wow, wordpress is a terrible medium for this exercise. this should really be on a forum where it’s possible to have threaded comments, because the comment section here has become a complete clusterf*ck.
Simple recursive Python version. Seems to work fine:
This was my javascript effort, seems to work okay:
I won’t post the code because you have 100s of examples, but for the sake of your “poll” – I did it in Python, coded it, reviewed it and then ran it without a single bug (Well I did slack off on writing meaningful exception messages :)
Took me about 10 minutes.
I hope to see the results soon, although I fear that they will be badly skewed.
Failed really stupidly by returning the relative index (from the evaluated range) instead of the absolute index.
Changed the code, ran the tests again and everything worked correctly. Woe is me.
Used ruby, btw.
Fun stuff :D
phps, untested.
Wrote this in Ruby in about an hour. Since I couldn’t test beforehand I made that final “if” block instead of actually figuring out what’s going on.
It passes the tests that I came up with.
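def bsearch(item, l, offset=0):
    if len(l) > 1:
        idx = len(l)//2
        p = l[idx]
        if p == item:
            return idx + offset
        elif p < item:
            # this branch was eaten by the comment markup; reconstructed
            return bsearch(item, l[idx:], offset + idx)
        elif p > item:
            return bsearch(item, l[:idx], offset)
    else:
        # one element left, or an empty list
        if l and l[0] == item:
            return offset
    return None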
python:
I’m 99.5% sure there’s no bug in the above. I was aware already of the integer overflow complaint, which is the only reason I didn’t use (a+b)/2.
A more interesting task would be to return the largest range [a,b) such that v[i]=T for a<=i<b
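One way to sketch that variant is as two binary searches, one for each end of the run. The function names below (`lower_bound`, `upper_bound`, `equal_range`) are my own choices for illustration; Python's standard `bisect` module offers `bisect_left` and `bisect_right`, which compute the same two positions.

```python
def lower_bound(v, t):
    """Smallest index i with v[i] >= t (len(v) if there is none)."""
    lo, hi = 0, len(v)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if v[mid] < t:
            lo = mid + 1
        else:
            hi = mid
    return lo

def upper_bound(v, t):
    """Smallest index i with v[i] > t (len(v) if there is none)."""
    lo, hi = 0, len(v)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if v[mid] <= t:
            lo = mid + 1
        else:
            hi = mid
    return lo

def equal_range(v, t):
    """Return (a, b) such that v[i] == t exactly when a <= i < b."""
    return lower_bound(v, t), upper_bound(v, t)

print(equal_range([1, 2, 2, 2, 5], 2))  # (1, 4)
print(equal_range([1, 2, 2, 2, 5], 3))  # (4, 4): empty range, T is absent
```

An empty result (a == b) doubles as the "not found" answer, and a is then the position where T would be inserted.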
First Erlang implementation, it seems.
When I was scrolling down to post, I saw a comment that reminded me that I forgot about the 0-length case. Darn.
If I add a guard in bsearch() to return the not_found atom if the length of the list is 0, then I think I passed. At least, I’ve tried all the cases I can think of and it works fine.
Ok, here’s mine. 25min and completely untested, thus probably horribly wrong and/or buggy (I usually copy/paste even my email address because I tend to mistype it…):
Did it in Haskell in 5 minutes, stupidly left out one base case and found it with the second test case. I admire Matthias Goergens’s solution for being a bit more Haskell-ey than mine.
Try again, with square brackets this time.
In Python (2.6 flavor). Untested.
I failed.
Tested, no bugs found… yehaa :)
Well, it looks like, there will be much too much code to review ;)
It certainly was interesting. It certainly looks easy but there are some pitfalls.
I failed my first run because of two typos which resulted in a logic error, so I added a case tester to debug and correct. Seems to work for all the cases I could think of.
What constitutes testing? Just running the code? What about using pencil and paper to help you test it by eye? I wrote the whole thing, but found one obvious bug really quickly just by looking at it. Then I did some testing with an example array and used paper to help track values and found another bug.
I wrote up unit tests in the code then ran it. Output: “Done. Press any key.” Total time = 1 hour. Language = C#.
It may not be as elegant as other posted algorithms. Rather than just checking the midpoint, it also checks the start and end at the same time. It might make more comparisons than is necessary in some cases, but it also makes far fewer comparisons in other cases (where the value you want is in a very early or late position of a large array.) It’s probably a wash. It at least never compares the same position more than once.
I tested it against one array quickly…and it failed. Had two bugs, both 1 character long. First I initially passed the array by reference (so an extra &) in order not to have to copy the array, but I guess you can’t do that in PHP. Second, I accidentally subtracted the end and starts to get the average instead of dividing.
Took me ~15 minutes. Sloppy code and not sure if it actually works.
Cool experiment.
Well… Test results: 1 typo, 1 Ruby fail (who ever uses input from the terminal ever…), 1 forgotten args in the recursion (D’OH!). But the algorithm works quite fine (although I do *not* check the 0-array case).
Submitted without testing or reading the comments:
I got it right on the first try. Yay for the 10%.
Note that your results will skew for success anyway, for several reasons. One is that people who read programming blogs are skewing for the better end, but the more important thing is that some unsuccessful people will not post, no matter what you told them.
Fencepost error, used “pivot = min” instead of “pivot = min + 1”. Infinite loop if no match. Worked when I fixed it, no boundary errors, remembered to check for an empty array.
My implementation also fails on duplicate keys. Oops.
A couple of comments,
1. I put in a couple of tentative tests at entry; the second is probably not needed
2. I totally failed to return the actual index in original array
I tried rewriting recursively and messed up the indexing and completely missed using low/high to compress a window.
so I got it right but I think I failed.
tidied up my comments after testing.
20 mins.
I was able to do it, including the ‘overflow’ dangers mentioned on the Google blog. I used pointers instead of indexes so I made sure of this
So, tentatively, I want to say that I passed the test. Submission is above and here: http://gist.github.com/371735
Unfortunately, as noted in my gist, I don’t know what I don’t know! I might be iterating too many times, and I don’t know which other cases I should be testing against. Someone needs to put together a test suite for the numerous Python submissions!
Aaand there it is: Failure to read the specification. I only search for values in the array. Cheers for me, I’m part of the 90%.
I had one bug that prevented me from identifying the value stored at the end of the array i.e. the largest value. Fixing it was pretty simple.
I have the 2nd edition of the book and every year and a half or so I come across this very section and end up re-coding it. Usually, I make a careless error. Never takes more than a few minutes to fix.
Argh… my first call to splice should use $pos rather than $pos-1 because it’s the length of the sub-list rather than the position of the end of the sub-list.
Here’s my attempt at a recursive version in ruby.
I think my attempt works. I first wrote it as a “contains” check before I realized the point of the task was to find the right index. Because I didn’t think enough before patching it accordingly I actually created an off-by-one error.
I find the requirements a bit harsh, though, because I’ve made it a habit to code with a Python shell open next to my text editor and writing in both, testing my logic and assumptions as I write the code.
I guess I’m not one of the 10 percent then, if only because I test my code rather than mentally parsing it line for line to check for errors prior to testing.
{source}var arr = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19];
var count = 0;
function bsearch(arr, lookup)
{
console.log(arr);
count++;
if(count > 100)
return 'endless recursion?';
if(arr.length == 1)
return arr[0] == lookup ? arr[0] : false;
var midpoint = parseInt(arr.length / 2);
if(arr[midpoint] == lookup) {
return arr[midpoint];
} else if(arr[midpoint] > lookup) {
return bsearch(arr.slice(0, midpoint), lookup);
} else {
return bsearch(arr.slice(midpoint, arr.length), lookup);
}
}
console.log(bsearch(arr, 9));{/source}
Correction. My code did not work with an empty list.
Here is the corrected code.
[SOURCE]
template <class RIt, class T>
RIt binary_search(const RIt &first, const RIt &last, const T &val)
{
RIt a = first;
RIt b = last;
for(;;)
{
typename RIt::difference_type diff = b-a;
if (diff<=0) return last; // empty range: nothing to dereference
if (diff==1)
{
if (*a==val) return a;
return last;
}
RIt c = a+diff/2;
if (val<*c) b = c;
else a = c;
}
}
[/SOURCE]
Works with any type T that is less than comparable. And it works with random access iterators (such as pointers or vector iterators).
This one is in javascript. Untested. 45 minutes (approx).
function binary_search(t,a) {
return binary_search_helper(t,a,0,a.length);
}
function binary_search_helper(t,a,start,end) {
/*
* assume start is untested
* assume end is also tested and does not contain the element
*/
if (start === end) {
return null;
}
/*
* only one element left to find
*/
if (end-start === 1) {
return t === a[start]?start:null;
}
/*
* end-start is at least 2 (since it is neither 0 or 1
* from the conditions above), thus Math.floor((end-start)/2)
* >=1.
*
* this is important to prevent an infinite loop
*/
var middle = start + Math.floor((end-start)/2);
/*
* a[middle-1] makes sure we look for the first occurrence
* middle>=1 since end-start is at least 2 and start is >=0
*/
if (a[middle] === t && a[middle-1] != t) {
return middle;
}
if (a[middle] < t) {
return binary_search_helper(t,a,middle+1,end);
}
/*
* also covers the case where a[middle-1] === t
* (i.e. we found the element but it's not the least
* element)
*/
return binary_search_helper(t,a,start,middle);
}
Submitted without testing. The function returns the length of the array if the value cannot be found.
Third time’s a charm? Delete my other comments please.
I forgot to test for an empty list. So my code wasn’t correct the first time around as I had thought.
Works with any type T that is less than comparable. And it works with random access iterators (such as pointers or vector iterators).
[Mike says: if you really want me to, I will delete your other comments. But I would prefer to retain them, because your successive iterations are instructive.]
I failed. Got updating the pivot index wrong the first time.
My current code succeeds with my tests. (And is ugly…)
Time to first version: ~10 min
Time to fix bugs: ~20 min
The really humbling thing isn’t how hard it is to write correct code without testing, but how long it takes, even for a simple textbook algorithm. Binary search is the sort of thing that sounds like it should take five minutes, but I took 45, and I don’t think I was distracted for more than 5 minutes of that. At least a third of the time was due to the added complexity of usefully supporting arbitrary types and orderings, not just numbers in ascending order. Here it is (in Common Lisp):
It has passed all my tests so far. However, I’m not sure this counts entirely as a success, because I reinterpreted the requirements to make it easier: I originally intended to allow equivalent elements in the ordering, but when I realized that was awkward, I just gave up and declared that the ordering had to be total.
I’ll try my hand at this. Here’s the code *before* testing (python):
Let me know if I screwed it up! Pretty sure it works though…
I believe this code is correct . Ultimate shame on me if there are any bugs remaining.
PHP:
About 12-14 minutes.
PHP, didn’t check this at all, thought it would be fun.
here’s my solution, in Haskell:
This works properly. However, I must admit I initially failed. The mistake was completely stupid as well; in the line reading
I accidentally wrote this instead:
Which makes me feel quite stupid :) fortunately, that seems to be the only error I can find. Let me know if you find any other errors.
How about a nice Fortran implementation :-)
I ran a few small tests – seems to work ok. Written in C.
int bin_search(int a[], size_t sz, int item)
{
int l_bound = 0;
int u_bound = sz-1;
int current;
if (sz == 0) return -1; /* empty array: a[0] and a[sz-1] do not exist */
if (item == a[l_bound]) return l_bound;
if (item == a[u_bound]) return u_bound;
while (l_bound + 1 < u_bound) {
current = (l_bound + u_bound)/2;
if (item == a[current]) return current;
else if (item < a[current]) u_bound = current;
else l_bound = current;
}
return -1;
}
Did half-assed testing (which passed). No code changes after testing
In Python, using slicing and tail recursion. Passes unit tests of all the usual suspects.
No guarantees of efficiency
Okay, I wrote it out on paper and worked out a few cases, then I just went out and wrote code + some tests. Probably has some bugs. And it’s ugly also. I fully admit that is a hack.
//tail recursion
int binSearch(int lo, int hi, int range, int theNumber, int sortedArray[])
{
int retIdx = -1;
if(range==2) //sort of whack…
{
if(sortedArray[lo] == theNumber)
return lo;
else if(sortedArray[hi] == theNumber)
return hi;
else
return -1; //not found
}
else if(range == 1)
{
if(sortedArray[lo] == theNumber)
return lo;
else
return -1;
}
int divider = lo + range/2;
if(sortedArray[divider] == theNumber)
return divider;
else if(theNumber < sortedArray[divider]) //recurse left
{
return binSearch(lo,divider-1,divider-lo,theNumber,sortedArray);
}
else //recurse right
{
return binSearch(divider+1,hi,hi-divider,theNumber,sortedArray);
}
}
Tests:
Wrong! But it’s very late and i’m coding while lying in my bed, not in the office :P
Okay, I came up with a nominally tested ruby implementation using recursion. Had an off by one error that led to infinite recursion, but I think that’s fixed now.
Okay, about to test. I should say I’ve written binary search a number of times in the past and got it wrong a number of ways and fixed it. So I have specific personal rules (which I won’t reveal at this time) about writing binary search!
But in this case I first wrote a very simple linear_search() function, then a tester, then a broken_search() function to make sure the tester actually catches the bugs it’s looking for. In that step I came across a bug I hadn’t tested for: not returning the ( i, nsteps ) tuple that the tester expects. So I must admit I had practice making and fixing that error. My code, alternate searchers, and tester are here. Note that I fleshed out Bentley’s spec to my own liking.
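The approach described here (a trivially correct reference search, a tester, and a deliberately broken search to prove the tester can fail) can be sketched as below. This is not the commenter's code: the function names, the choice of test arrays, and the return convention of -1 for "not found" are assumptions made for illustration, and the (i, nsteps) tuple mentioned above is left out.

```python
def linear_search(a, t):
    """Reference implementation: obviously correct, just slow."""
    for i, x in enumerate(a):
        if x == t:
            return i
    return -1

def binary_search(a, t):
    """Candidate under test; searches the closed range a[lo..hi]."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid - 1
        else:
            return mid
    return -1

def broken_search(a, t):
    """Deliberately wrong: the upper bound starts one too low."""
    lo, hi = 0, len(a) - 2
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid - 1
        else:
            return mid
    return -1

def check(search):
    """Return the first (array, target) where search disagrees with linear_search, else None."""
    for n in range(0, 9):
        a = list(range(0, 3 * n, 3))      # distinct sorted values, so the index is unique
        for t in range(-2, 3 * n + 2):    # present values, gaps, and both out-of-range sides
            if search(a, t) != linear_search(a, t):
                return a, t
    return None

print(check(binary_search))   # None: no disagreement found
print(check(broken_search))   # ([0], 0): the tester catches the planted bug
```

Running the tester against a known-bad function first is what gives a clean result on the real function any meaning.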
Matlab code:
B = floor(rand(50,1)*100)
T = 7
A = sort(B)
Found = 0
while (length(A)>0)
j = ceil(length(A)/2)
if (A(j) > T)
if (j ==1)
A = [];
else
A = A(1:(j-1));
end
elseif (A(j)<T)
if (j==length(A))
A = [];
else
A = A((j+1):length(A))
end
else
Found = 1;
A = [];
end
end
if (Found ==1)
display('found the target number')
else
display('target number not found')
end
will test it in a second :)
GO GO GADGET BINARY SEARCH FUNCTION:
int binarySearch(int* array, int arraySize, int value)
{
int currentMax = arraySize-1;
int currentMin = 0;
int index;
if( !array || arraySize <= 0 )
return -1;
while(1)
{
index = ((currentMax - currentMin) / 2) + currentMin;
if( *(array+index) == value )
return index;
else
{
if( *(array+index) < value )
{
currentMin = index + 1;
}
else
{
currentMax = index - 1;
}
}
if( currentMax < currentMin )
return -1;
}
}
A simple ruby script which take arguments from command line. The search is the first number, the following numbers are the array.
Okay, mine passes my tests!
I stupidly changed the name of the function after posting it to the comment box, but neglected to change the calls. Maybe that counts as failing?
Anyways, here’s the correct version. Also, I just use default arguments instead of a helper function.
public class BinSearch {
/**
* Search for b in sorted (ascending) array a
*/
public static int binSearch(int a[], int b) {
if(a == null || a.length == 0)
return -1;
int min = 0, max = a.length-1;
while(min <= max) {
int mid = (min+max+1)/2;
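if(a[mid] < b)
min = mid + 1;
else if(a[mid] > b)
max = mid - 1; // the comparisons here were eaten by the markup; reconstructed so the range always shrinks
else
return mid;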
}
return -1;
}
}
Here’s my attempt in Ruby. Haven’t tested it yet, but I’ll be doing that shortly.
Hey, sounds fun. I love a good challenge. I’m writing this in a text editor to avoid reading any previous comments that have come in since I read the article. Below is my implementation in Python 3 syntax (it “compiles” without errors). I have to say, it’s very hard not to test this before posting! I’m a bit of a novice, so I don’t expect this to be bug free. Go easy on me.
I had to look up the syntax of ldiff, and initially forgot to floor mdx. That got it running, but it turns out I forgot to handle values that aren’t found.
I realize using nthcdr and ldiff is probably extremely terrible.
Just some quick thoughts… The length is either odd or even, so as one recurses it all eventually boils down to an odd middle index or an even middle index.
Worked perfectly from the very start:
Untested… this was a good exercise!
Oops, it didn’t work on the empty array case. But other than that it worked.
Untested… this was a good exercise!
ECMAScript implementation, using a small recursive function.
I didn’t look at others source code, no looking at other binary search routines and no testing until it was done.
This code makes a small dictionary tree and searches for words you put in.
Some basic Haskell:
Matthias Goergens’ solution makes me realize that there is a lot of Haskell learning to do.
failure!
Here is a VB.NET version that is tested. I failed a few cases on the first try. Biggest mistake was an off by one error. This is a rewrite to remove all the unnecessary code and fix all the edge cases.
Tested, and seems to work. Got it first try.
Testing is part of software development. This challenge is a bit like asking artists to draw a perfect circle while blindfolded. It’s a neat parlor trick, but you can’t use it as a litmus test for a “real artist” or “real programmer.”
Anyhow; I saw that someone already posted an overly-generic C# IEnumerable solution, so I wasn’t going to post mine. But since the earlier solution uses the ElementAt extension method, mine is still notable in the “overkill” category.
untested, java
ok, so after about 25 minutes,
basic pseudocode is
return index if good
return go_left(lower, index-1) if less good
else return go_right(index+1, upper)
I have not tested the following code…
10 minutes later the above is typed in.
Looks like more specific pseudocode, but is hopefully well-formed Python.
This assumes either that the value is unique, or that we don’t care which index is returned for multiple.
Mind you I’ve done this in school previously, so it was good reminder.
If I had more time I’d step through with a couple of cases. Maybe even test it in an actual interpreter.
http://pastebin.com/ASNCka0y
Here’s my attempt at a recursive D (D 2.0) version:
It passes my tests (see the gist) but I’m sure I’ve missed an edge condition or four.
Can someone test mine? I’m too lazy to. -.-
Success–but I did it recursively, which I consider kind of a fail given the use case.
I notice I’ve misunderstood the task. I should have worked on an array and not a tree structure.
I’m going to do the “extra credit” macho option and post my solution before testing it. (I did run a syntax check, and I did mentally trace through a number of cases first.)
I wrote this from memory – but as I wrote an assembler version only a few weeks ago, the technique is still fresh in my mind – and specifically the use of array.size as the upper bound, rather than array.size-1.
Scanning through the other comments, seems like this one is very similar.
I failed – simple bug, but failure none the less.
However, I don’t really see a point in this exercise – are you(or the author of the book) implying that if a programmer can’t implement this on first try, without testing, he’s not as good as one who can?
IMO, it’s not a good metric for the ability of the programmer.
Now, if a programmer doesn’t _understand_ the binary search algorithm after reading it – then we might have a problem…
Tested and seems to work in ruby, looking for the bug after posting comment:
I think I passed, but I didn’t test.
Question, though: if we all know that only 10% of people passed this test, won’t we also try harder, thus skewing the results? A balanced experiment would be double-blind.
Thoroughly. I mean I didn’t test thoroughly. Just FYI.
Only problem I had when I did my first test was that I forgot to return the results of my recursive call to binary_search.
Test suite:
C++ code, not tested or compiled, returns either an index or -1 for not found. Pretty confident it’ll work, let me know if it doesn’t ;)
When I read “The only way you’ll believe this is by putting down this column right now and writing the code yourself. Try it.” I stopped reading and tried it.
My binary searches worked without error. Although the first test case I wrote made me think it didn’t because it passed in an unsorted array – and the item wasn’t found.
I think the lack of testing is bunk – who writes code without testing it? We need more people that write correct code because they’ve tested it fully. Not more people who submit code without testing because they think they’re one of the 10% who can do it.
Disclaimer:
I stole many of the tests by reading the comments.
Ps. I just realized that the not_found variable is redundant – but it doesn’t hurt anything aside from memory usage.
:P
Code plus tests. Code in less than 5 minutes + another 10 to write the tests and be sure that it did what it should do before running it.
I understand that one can slip a bug into something that is taken for granted, but come on…
Mine (posted above) seems to work. Since we were ignoring numeric overflow I went ahead and assumed we could ignore stack overflow as well. For a real library or to run on an embedded device (I like AVRs) I’d write an explicitly iterative version.
Does anyone have any tests I haven’t thought of?
Given the blog owner’s choice of Lisp language, I’m surprised that of the first ~300 entries, mine was the only Scheme one. Wow! :-) (I just grepped for “(define”, so someone correct me if I missed something.)
Ruby – NON Recursive
So perhaps it doesn’t count having written once already… but I wrote a new one, avoided the syntax errors that caught me up the first time, realized that recursion is going to eat a lot of unnecessary resources for large arrays, and ran correctly out of the box this time. :)
It’s interesting that all the common error cases (null array, element missing) get caught by the same test for @range being negative.
Untested, un-peeked (except for the non-overflowing mid calculation which was in one of the first solutions). I already know it doesn’t do the empty array test so, I fail it but, let hope the rest works…
Err… forgot to post the couple last lines :P
Forgot to account for a search for a non-existent element; so I failed :(
Works as far as I can tell…
This is interesting, a somewhat related post can be found here:
http://googleresearch.blogspot.com/2006/06/extra-extra-read-…
The gist of this is that version of BSearch implemented in the JDK contained a bug due to an overflow error. I recall reading about it some time ago, funny to see it come up again.
[Mike says: this is, I think, the third comment to point out the Josh Bloch integer-overflow article as though I hadn’t linked to it from the original post itself. I can only assume these are being posted by people who’ve not actually read my post. I’m letting them all through moderation because they’re not spam and not abusive, but they really don’t add much to the discussion.]
uy, lo,hi = found instead of mid; IFI. But other than that mine’s good. Bonus points for elegance I say; not a +/- 1 to be found :)
I didn’t read the challenge until I tested my code. I only made one change after testing, and that did not affect correctness. (I had a special case for arrays of length 1.)
It turns out I instinctively avoided a bug that would have resulted in an infinite loop when checking a slice containing two elements. Lucky, I say.
Also in D.
First version had a bug in it, easily fixed though.
Looking through the comments I like the recursive solutions.
Does it mean you’re not a professional or decent programmer if your first version has a couple of bugs in it…? i don’t think so.
I’m not sure what this proves.
I think this is a bit of a sham. My guess is the author of the book is just overly picky when reviewing. My first run seg faulted :P but this was not a result of the function, but rather an argument reading error where I tried to read argv at argc instead of argc-1, so I don’t count that as my logic error as it was outside the bsearch function. All subsequent testing succeeded.
@Joe User (comment 1805): Applied my tests to your code and it failed in the first one, ie. looking for a value outside the range. Say you have a list [1, 3, 5, 7] and I look for 0… your code gets into an infinite loop
Works AFAIK
One last attempt to get word press to quit eating all my code :(
I know I just posted, but some of these implementations (assuming they passed testing) might be useful over at http://rosettacode.org/wiki/Binary_search if your language hasn’t been done yet.
Mine worked.
I created a binary tree (computing next node pointer during the search, not ahead of time). It uses recursion. It’s probably not the fastest, and with 3 classes it’s also the most lines that i’ve seen so far… but it works.
I did it in C++.
I failed, but I got it fixed in about 5 mins :)
Test cases are key. My first implementation worked for all values in the list, and values that fell inside the range of the list (> than least, greatest and < least test cases popped into my head. Had to add another check for the latter. So, does it count as a fail if you yell "Done!" then realize you missed something?
Ruby implementation came close. I’m pretty proud of that given that it was my first run ever, with no formal compsci training. Took three edits to match my big test suite :)
http://pastebin.org/160181
Though the do-without-testing concept is interesting, I’m not really sure it accurately represents programming prowess. If an application is being correctly designed, automated testing while working should be encouraged.
I *could* have spent half an hour poring over the code, running test cases myself, but instead I just ran the test suite, found the problem, and identified and fixed it within a minute. Writing an algorithm in one’s head, pass/fail, seems like an interesting test, but it is not the mark of a good, productive programmer.
Or maybe I’m just whining since my 5-minute attempt didn’t pass. Ah well.
Writing code without testing is akin to doing math without a calculator.
This seems to work – first attempt.
Got mine working (as much as I can see, but that doesn’t say much…) after one quick bug in testing (instead the correct array position for the found (pos[mid]) i returned mid)… Any comments?
It won’t pass the integer overflow test (though at least it will throw an exception rather than silently failing), but here’s my answer in Clojure:
So that’s my “vote” — if we were supposed to protect against integer overflow I didn’t get it, otherwise I did.
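"""
def bsearch(x, l):
    print l
    if len(l) == 0:
        return False
    if len(l) == 1:
        return x == l[0]
    pivot = len(l)//2
    val = l[pivot]
    if x == val:
        return True
    elif x < val:
        # l[:pivot] keeps everything below the pivot; l[:pivot-1] dropped one element too many
        return bsearch(x, l[:pivot])
    else:
        return bsearch(x, l[pivot:])
"""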
Jonathan Deutsch, I can’t decide whether your comment is trite or deep.
Arithmetic without a calculator borders on nonsense. But for analysis, algebra or proofs in general, you’d need a very advanced calculator.
Succeeded (at least on my test cases), in M. I assume array is zero-indexed.
(It may be worth noting that the right way to do this in M would be to use the values as the subscripts to the array, since arrays in M are actually more of a map structure.)
Hmm. So the question now is, can we trust programmers to tell the truth about how good they are?
I think the answer is clearly no.
Success under the specified conditions, but a slight failure in the enhancement I tried to make at the same time.
I tried to get the .NET semantics where the index is returned when the element is found, and the two’s complement of the index the element should have been otherwise. I rushed the two’s complement part and got the value slightly wrong, but the basic ‘is-it-there-or-not’ search was correct, including overflow safety provided my ‘a + ((b - a)/2)’ is the correct solution for a safe midpoint calculation… I didn’t have enough memory to determine whether there is a gotcha there ;)
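For reference, a minimal sketch of that return convention in Python. As far as I know, .NET's Array.BinarySearch documents the not-found result as the bitwise complement of the insertion index (~i, which equals -i - 1), not the arithmetic negation -i; the two differ by one, which makes the value easy to get slightly wrong. The function name below is hypothetical.

```python
def binary_search_net_style(a, t):
    """Index of t if present; otherwise ~i, where i is where t would be inserted."""
    lo, hi = 0, len(a)                 # half-open range [lo, hi)
    while lo < hi:
        mid = lo + (hi - lo) // 2      # overflow-safe form of the midpoint
        if a[mid] < t:
            lo = mid + 1
        elif a[mid] > t:
            hi = mid
        else:
            return mid
    return ~lo                          # always negative; ~(~lo) recovers lo

a = [10, 20, 30]
r = binary_search_net_style(a, 25)
if r < 0:
    a.insert(~r, 25)                    # complement again to get the insertion point
print(r, a)
```

The complement is used rather than plain negation because insertion index 0 has to map to a negative number too: ~0 is -1, whereas -0 would be indistinguishable from "found at index 0".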
Mine seems to work, at least for the handful of testcases I tried.
As requested, I report that I failed. I wrote
When it should have been:
Tested briefly, seems to work correctly. Assuming I can use the rule “You’re allowed to use your compiler to shake out mechanical bugs such as syntax errors or failure to initialise variables” to cover learning to write classes in Python (because it took me three tries to reference class variables correctly), it worked first try.
Wrote, tested, posted. Note that this is searching for a key-value pair with a given key.
How about proving it correct before even testing? I’ve used VCC http://vcc.codeplex.com/
To say only 10% can write a binary search is inaccurate. The rules being applied here are not realistic and rather contradictory to some development processes. For example:
NO testing until done writing? What about Test Driven Development?
Do you actually expect programmers to be able to write completely bug free code on their first run through?
I understand the basic premise: that programmers seemingly aren’t as good as we’d expect. But this feels like another apocalyptic assessment of how “kids” these days can’t code worth a damn. Not to say that all the high level coding with libraries degrading overall competency isn’t a worry.
I should mention that I fall in the kid age group, as I’m still in college. But I’m nearing the teenage years regarding overall experience and intuition.
To be honest I only tested it with one array, I spent most of my time making it small… I was hoping to get it in under 80 characters, but I’ve only managed 111.
Got this wrong on the first try, didn’t handle failure to find the target item. Can’t find any other bugs but let me know if you do:
In Go:
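def bsearch(ar, v):
    if len(ar)==0: return -1
    # the comparisons on the next line were eaten by the comment markup; reconstructed
    if v < ar[0] or v > ar[-1]: return -1
    return bsint(ar, v, 0, len(ar))
def bsint(ar, v, f, t):
    if f==t:
        if ar[f]==v:
            return f
        else:
            return -1
    mid = (f+t)//2 # integer arithmetic
    am = ar[mid]
    if am==v: return mid
    # the left-hand branch was also eaten; bsint(ar, v, f, mid) is a reconstruction
    if v<am: return bsint(ar, v, f, mid)
    if v>am: return bsint(ar, v, mid+1, t)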
Wrong the first time.
Had less-than rather than less-than-or-equal-to as my loop-condition but other than that it worked in my tests.
Correct after fixing two syntax errors:
forgot to mention that I forced myself to do it in C95 without recursion…because it had been a while.
A file of “standard” test cases
I’ve written a file of 4096 tests of the following form:
You can find zipped and gzipped versions of the file, a more detailed explanation, and Python code to test your Python function against the file, here.
I’m amazed how many people thought Mike’s point or Bentley’s point was about not testing code. Of course you should test! The point is whether you can get it right without using testing as a crutch to get there before you test.
Developed and tested on the command line. I didn't get it right on the first run.
Got it wrong the first time (it was a small fix)–here’s the corrected version:
PHP bi_search – work for first test and every test after. should work for integers, floats, characters. No particular reason for language choice.
My above code has not been tested – C#.
It seems to work, except I stupidly forgot the increment in the second case the first time I tried it.
Argh, failed. 2 huge bugs. Shame upon me.
A try:
Python not tested or compiled:
I failed.
I would agree with several other posters that the implement without testing is a strange way to go about development. I could have found my problem if I sat and looked long enough, but instead I ran it, found the problem and fixed it in minutes. *shrug*
Found 1 bug as I tested the code (infinite loop). I needed to change the condition to be <= 1 and not == 0.
Time : roughly 2 hours.
Status: Failed – initial attempt
Working as of second attempt – not thoroughly tested.
So I wrote my binary search in Perl and it worked at first go, just like it will for so many other readers of your blog. I doubt anyone’s day will be enhanced by me posting the code, so please take my word for it.
I think the reason why only circa 10% of programmers can write a functioning binary search routine is that only circa 10% of programmers have brains wired for implementing low-level algorithms.
Twenty years ago, “not being able to write low-level algorithms” was the same as saying “not able to program computers” but the past few decades of progress in software development have been aimed specifically at allowing the other 90% – the ones who can’t actually program – to produce useful software despite their handicap. This has been great for the 90%, and for the companies who employ them, but it’s come at a cost to programming culture.
Code at http://gist.github.com/371978
I actually wrote two implementations. The first was a completely naive recursion version, which worked perfectly.
The second was an iterative version of the first, but I made a stupid, horrible mistake. When I’d finished, I decided that my extra slice variable was redundant and I could just mutate the original array.
This wouldn’t have been a problem except that I’d chosen the sentinel value for “not in array” to be the length of the array as opposed to -1. Since I was now mutating the original array, my iterative version would always return index 0 for any element not in the array. WHOOPSIE.
That’ll teach me to try and be clever.
On the upside, I was pleasantly surprised when all the assertion failures for the recursive method turned out to be bugs in the tests.
Also on the up side is that neither function should be susceptible to the overflow bug by virtue of using D’s slicing syntax.
On another note, I think discounting the overflow bug is a bad idea: it’s a bug, end of story. It’s not hard to avoid, given the correct data structure (in this case, slices eliminate it entirely).
I did a unit test. Sorry.
I failed. I wrote the code (below) and ran through 5 simple test cases but it wouldn’t find the element if it was the last in the list.
I thought this was a great exercise. I haven’t used System.arrayCopy in forever, and 50% of my job is J2EE development!
Ahh, shouldn’t have made those three variables final and fixed my failure returns.
Now it should be guaranteed not to overflow and to work properly.
This sounds like a lot of fun, but it’s sort of a party novelty. Are you allowed to use backspace to delete, or is it a one shot: write it and release it? I don’t know if I could avoid hitting backspace. It’s rather horribly ingrained by now.
I don’t really see the point of writing code without testing. The whole point of programming is to write broken code and fix it. Usually I just create an empty source file and start debugging. That’s one of the reasons I like Realbasic. You create an empty program and run it, and up comes a trivial window and a menu with the quit command. It’s a very satisfying starting point, full of possibilities. Then I pop open a text editor and start the spec.
After all, the last few times I’ve written a binary search were to find an insertion point, not a particular element, and to search a very large, but relatively uniform database by estimating a “mid” point based on the extreme and search values. An old working binary search I had lying around made an excellent starting point. Just replace the axe blade, slip in a new handle, try a different blade, and voila, it’s all debugged.
doh! it should’ve been:
I’m going for extra credit here, posting my Python code before I test it. This isn’t the first time I’ve written a binary search, but it’s been a long time – at least 15 years as best as I can remember. I long ago learned the secret – make sure each pass is narrowing the search range by at least 1, otherwise you can get into an infinite loop.
To make this more interesting, I made it work with only a less-than operator, similar to the way the standard C++ library works.
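The two ideas in this comment (shrink the range by at least one element on every pass, and compare with less-than only) can be sketched as follows. This is a minimal illustration, not the commenter's code, which did not survive posting; the names `lower_bound` and `binary_search` are assumptions chosen to echo the C++ standard library.

```python
def lower_bound(items, target):
    """Return the first index i for which items[i] < target is false.

    The search range is half-open: [lo, hi). Only < is applied to elements.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = lo + (hi - lo) // 2      # lo <= mid < hi
        if items[mid] < target:
            lo = mid + 1               # drops mid, so the range shrinks
        else:
            hi = mid                   # mid < hi, so the range shrinks here too
    return lo


def binary_search(items, target):
    """Return an index of target in sorted items, or -1 if it is absent."""
    i = lower_bound(items, target)
    # Equality expressed with < only: target is not less than items[i],
    # and lower_bound already guarantees items[i] is not less than target.
    if i < len(items) and not (target < items[i]):
        return i
    return -1
```

Because both branches remove at least one candidate, the loop cannot spin forever, even on an empty list or a value larger than every element.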
I’m quite surprised that you didn’t include some test conditions, but I see a few comments above mine that at least one person has volunteered. Thanks!
Alrighty then, what are the test cases?
grr, how do I edit this again?
above source is what I ended up with (Ruby). I failed insofar as one of my test cases produced an endless recursion. I then decided to handle the trivial cases explicitly, with good results.
gist.github.com/371995
Failed on initial try in C with conditional silliness, took a couple more “oh, yeah, that” thoughts to finish.
Am one of the 90% – didn’t consider zero length input array. I’m ashamed to say I wrote it in IDL.
I didn’t have the stones to post it without trying it out but I did write it all the way before I did. If I’d have been in the class, I’d have been part of the 10%. Woo Hoo!
Ok crap, my code fails if the array has a length of 1. Need to use
instead of
as the while condition. FAIL.
Ugly and inelegant, but it works. I hope.
Seems to, anyways.
Running:
Success! I wrote a recursive version first:
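struct array_elem
{
    int key;
    char *value;
};
char *bin_search(struct array_elem *array, int count, int key)
{
    if (count==0)
    {
        return NULL;
    }
    else
    {
        int split_ix = count/2;
        int split_value = array[split_ix].key;
        if (split_value > key)
        {
            return bin_search(array, split_ix, key);
        }
        else if (split_value < key)
        {
            return bin_search(array + split_ix + 1, count - split_ix - 1, key);
        }
        else
        {
            return array[split_ix].value;
        }
    }
}
/* Iterative version. The function name and loop header are assumed;
   that part of the comment was lost in posting. */
char *bin_search_iter(struct array_elem *array, int count, int key)
{
    while (count > 0)
    {
        int split_ix = count/2;
        int split_value = array[split_ix].key;
        if (split_value > key)
        {
            count = split_ix;
        }
        else if (split_value < key)
        {
            array += split_ix + 1;
            count -= split_ix + 1;
        }
        else
        {
            return array[split_ix].value;
        }
    }
    return NULL;
}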
Both work on the handful of test cases I've tried, so there might be some bugs. I'm curious what the common mistakes are.
I take it back. It misses some values with the floor() calculation. Bummer.
I’m going to own up to a failure on the first attempt.
I started writing it after reading the Programming Pearls quote rather than later in the article, so I didn’t see the “no testing” admonition until later.
For posterity:
I got syntax errors on my first try, but after I cleaned them up the code passed my tests:
My code worked the first time, but my test had a bug. (When the haystack had duplicates, it would want bsearch to magically give back the correct one.)
I had no trouble implementing a correct solution, although sample arrays to test with probably would be nice
doh, meant C89 not C95
Here’s my attempt, with associated tests:
I had two bugs (so far). The first was that I got the middle comparison around the wrong way, and the second was that I forgot to add 1 to the final result. I might have a go at an iterative version later on…
I failed. Though I ended up with a correct implementation, I went and coded it before I saw the “no testing” rule, and my code was not correct for my first test.
Here’s my Haskell solution, at least a bit different than others posted so far:
So, here’s my 1 hour JS tryout. It’s not tested so hopefully I nailed it!
I used Delphi
Seems my last comment was truncated, here a new try:
I failed. Programming is hard, let’s go shopping!
Finally I protected my ‘<‘ by HTML code…
In Ruby, in the spirit of Bentley’s code description:
I had to mentally test this before getting the upper and lower range calculation correct.
@Kaleberg: “An old working binary search I had lying around made an excellent starting point.” – agree, usually I find this to be a superb strategy!
An iterative version. I had a bit more trouble with this one, it went into a spin and I had to limit the number of iterations:
I had a number of bugs in this one, which took a while to work out. The first was that I (again!) had my comparison around the wrong way, and the second is that I’m not sure that there’s a clean way to prevent my code going into an infinite loop for non-existent values. I have an off-by-one error in there, perhaps?
Yep, so I’m a terrible programmer – I’ll go hang my head in shame now… but read up on binary sort :)
Many commenters have said that this challenge is meaningless, since there is no real-world application of coding without testing. I disagree.
First, I have found that practicing coding on paper forced me to learn better habits about correctness, which results in higher coding speed. So [if you’re like me], doing drills of this non-interactive nature will improve your productivity.
Second, frankly, if you can’t manage this elementary a case (without tests), how can you trust yourself to come up with all the relevant tests? Consider those who posted their allegedly-tested programs and were incorrect. There but for the grace of formal proof methods go I.
Oh, and I wrote it in 10 or 15 minutes, recursively in Python, and believe it to be error-free. Let’s see if I can post it successfully:
Success. I just want to point out that this is a really terrible problem. Given the rules, it is not surprising that 90% fail, and it is not really a measure of programmer skill or productivity.
I could have written this, debugged it, and written a battery of unit tests in five minutes. Instead it took me about 25 minutes of being a human compiler/computer before I was confident enough to compile and run it. I used pen and paper to test it by hand and clean out the bugs. This is what I have a problem with; I could have just used the computer to do this for me. I found this problem incredibly frustrating because I was stripped of everything that I consider makes me a good programmer.
Here’s the test I ran against it in case anyone was wondering. Not terribly thorough, but I really don’t care.
All values in the output matched up with their position in the array. I also had no compiler errors or warnings in the implementation of bs() on first compile (as C99); I got an error first because gcc compiles in C89 by default (so it complained about the for loop declaration), and I got a warning because I forgot to include stdio.h (technically part of the test, since the algorithm does not need that header.)
I believe this is correct. No promises though, I’m a novice coder.
Language this month=Actionscript 3.
Did not work the first try, as I had mixed up my . (This is the fixed version.)
Woops, that was supposed to be: “Did not work the first try, as I had mixed up my less-than and greater-than signs”
Oh also, I should mention I tested mine on paper for (almost) all branches for array sizes zero through five (yes, this is why I am frustrated.) So I am confident that it is correct.
I still don’t think these rules are fair. For instance my code obviously fails integer overflow, as others have mentioned in their own code. Does this even matter? I wouldn’t expect an algorithm of bs() to guard against this in any way besides an assert(). I don’t consider this an error.
I find it silly that putting – instead of +, for example (as I did in (min + max) / 2, luckily I noticed it before compiling) is considered failure, which is something that would immediately be caught if we were allowed to test; and yet, people here are claiming success with recursive solutions in languages without tail-call optimization, so they have O(log n) space requirements (the algorithm should be O(1) memory.)
This can’t possibly be graded pass/fail. A question like this on a paper test makes sense if you can still get part marks for a decent attempt.
Anyway. /complaining
Failed on first attempt, forgot to exclude the middle element in the recursive call:
def bsearch(in_list, target):
    if len(in_list) == 0:
        return False
    else:
        middle_index = len(in_list) // 2
        middle = in_list[middle_index]
        if middle == target:
            return True
        elif middle < target:
            return bsearch(in_list[middle_index + 1:], target)
        else:
            return bsearch(in_list[:middle_index], target)
Not tested yet. Thirty minutes. Revolution, written recursively, which glancing at some of the other solutions I’m not sure I prefer now. I wrote it to take a list rather than an array.
Not quite clean.
Got it right the first time.
There are a few reasons why this algorithm is tricky.
* If you use a lim-index instead of a max-index, you’re likely to access outside the buffer when searching for an element higher than the array max. Fencepost error.
* It’s easy to forget the case of searching for a non-existent element. In the case above, I’ve even provided a return value that tells you (unambiguously) where the insertion point would be for a non-existent element.
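Both points can be illustrated with a short sketch. The commenter's own code and return convention were not preserved, so the encoding below is an assumption: it borrows the convention of Java's `Arrays.binarySearch`, which returns `-(insertion point) - 1` when the value is absent. The function name is hypothetical.

```python
def search_or_insertion_point(items, target):
    """Return the index of target, or -(insertion point) - 1 if absent."""
    lo, hi = 0, len(items) - 1         # hi is a max-index (inclusive), not a limit
    while lo <= hi:
        mid = lo + (hi - lo) // 2      # always within [lo, hi], so never out of bounds
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    # The loop ended with lo == hi + 1; lo is where target would be inserted.
    return -(lo + 1)
```

A negative result `r` is unambiguous, because even insertion point 0 maps to -1; the caller recovers the insertion point as `-r - 1`.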
Five minutes, and Python, obviously:
def binary_search(collection, value):
    """returns the index"""
    if not collection:
        return None
    if collection[0] == value:
        return 0
    if collection[0] > value:
        return None
    if collection[-1] == value:
        return len(collection) - 1
    if collection[-1] < value:
        return None
    # Bisect:
    mid = len(collection) // 2
    if mid == 0:
        # Done:
        return 0
    if collection[mid] == value:
        return mid
    if collection[mid] < value:
        # This branch was lost in posting; reconstructed as the mirror case.
        result = binary_search(collection[mid + 1:], value)
        return None if result is None else mid + 1 + result
    return binary_search(collection[:mid], value)
Seems to work, although I have only made like three tests. :) I find the idea of a challenge where you aren’t allowed to actually run your code moronic. This isn’t testing for how good programmers are, it’s more about luck, IMO. I’d say the passage in the book is outdated, because now we have something called test-driven development. :)
Followup: seems to work. Tested:
1. target is in list
2. target is at start of list
3. target is at end of list
4. target is not in list
5. target is less than smallest value in list
6. target is greater than largest value in list
7. target is empty
8. list is empty
9. target and list are empty
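A checklist like this translates directly into a table of cases. The sketch below is hypothetical: `candidate` stands in for whichever implementation is under test (here built on the standard library's `bisect_left`), and it assumes a "return the index, or -1" contract. Cases 7 and 9 are left out because what an "empty target" means depends on the element type.

```python
from bisect import bisect_left


def candidate(items, target):
    """Stand-in implementation under test: index of target, or -1."""
    i = bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1


# (description, sorted list, target, expected result)
CASES = [
    ("target is in list",                [1, 3, 5, 7, 9], 5, 2),
    ("target is at start of list",       [1, 3, 5, 7, 9], 1, 0),
    ("target is at end of list",         [1, 3, 5, 7, 9], 9, 4),
    ("target is not in list",            [1, 3, 5, 7, 9], 4, -1),
    ("target is less than smallest",     [1, 3, 5, 7, 9], 0, -1),
    ("target is greater than largest",   [1, 3, 5, 7, 9], 10, -1),
    ("list is empty",                    [], 5, -1),
]


def failures(search):
    """Return the descriptions of every case that search gets wrong."""
    return [name for name, items, target, want in CASES
            if search(items, target) != want]
```

Calling `failures(candidate)` returns an empty list when every case passes, and otherwise names the cases that broke.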
sure not the fastest but kinda readable for me:
@dewb:
> Testing is part of software development. This challenge is a bit like asking artists to draw a perfect circle while blindfolded. It’s a neat parlor trick, but you can’t use it as a litmus test for a “real artist” or “real programmer.”
My daughter went to art school, and one of the exercises they used was to draw something without ever looking at the paper. So perhaps your analogy is more confirming than refuting.
How are you going to test all these!? I see just about every programming language imaginable here!
Just a small rewrite of my previous version (shortened the variables’ names to 1-2 letters to see how short can it get :D) + unit testing to ensure that it works as designed. Looks like it does:
Following the prototype of C99 bsearch():
, I wrote this:
http://pastebin.com/i8UGMkJi
Lots of room for improvements, but meh. I’m going to steal other people’s tests now and check it out.
This was actually the first time I ever implemented the binary search. And I actually think I got it right, too!
Source code in PHP follows. Wrote some basic tests too, but don’t want to waste space.
Untested Javascript. (A bit bold since I’m new to the language.)
Apparently Adobe have found similar results. Years ago Sean Parent posted that they used the test on interview candidates and saw high failure rates. The only article I can find now is http://stlab.adobe.com/wiki/images/8/8c/Boostcon_possible_future.pdf (warning – this document contains a binary search implementation).
Not tested:
My first try failed. I had to switch my first two if-statements around and now it (looks like it) works:
int bsearch(int* haystack, int left, int right, int needle) {
    int middle = (right + left)/2;
    if (haystack[middle] == needle) {
        return middle;
    }
    if (left >= right) {
        return -1;
    }
    if (haystack[middle] > needle) {
        return bsearch(haystack, left, middle, needle);
    }
    if (haystack[middle] < needle) {
        return bsearch(haystack, middle+1, right, needle);
    }
}
Alright, it actually looks correct. Tried out of range values as well as valid ones, empty array as well as a huge one, and I confirmed the results with a debugger, just in case.
Now, here are some bugs that 20 minutes of thinking solved (without testing!):
* When computing the middle index, avoid additions. They might overflow for large arrays. I’ve learned that in the past, the HARD way.
* Testing for (!found) usually leads to infinite loops, so I ditched that approach quickly.
* I started writing a binary sort…then I decided to read the specifications! :-)
* Although this is a typical example of a recursive algorithm, I’m not comfortable enough with recursion (I know, shame on me…), so I decided to go iterative..
* Do not mix semantics. If you decide your binary_search() should return an index into the array, this will impact your implementation – using uints instead of plain ints for example, which may or may not lead to nasty bugs if you’re not careful.
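The first point, about additions overflowing, can be made concrete. Python integers do not overflow, so this sketch simulates 32-bit signed wraparound (the behaviour of Java's `int`; in C, signed overflow is undefined) with a hypothetical helper `wrap_int32`. All names here are illustrative assumptions.

```python
INT_MAX = 2**31 - 1


def wrap_int32(x):
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    return (x + 2**31) % 2**32 - 2**31


def naive_mid(lo, hi):
    """Midpoint as (lo + hi) / 2, with the sum wrapped to 32 bits."""
    return wrap_int32(lo + hi) // 2


def safe_mid(lo, hi):
    """Midpoint as lo + (hi - lo) / 2; no intermediate exceeds hi."""
    return lo + (hi - lo) // 2


# For small indices both agree; near INT_MAX the naive sum wraps negative.
small = (naive_mid(2, 6), safe_mid(2, 6))
large = (naive_mid(2**30, INT_MAX), safe_mid(2**30, INT_MAX))
```

With `lo = 2**30` and `hi = INT_MAX`, the wrapped sum is negative, so the naive midpoint would index before the start of the array, while the subtraction form stays inside `[lo, hi]`.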
Full disclosure:
You probably shouldn’t include me in the statistics, because I did test my code a good number of times. I only tried it because I was doing the binary search in Haskell, which I’ve only been playing with for a few days, and with limited time, so I am far from truly getting it.
That being said, after a few tries, I got this:
Like I said before, I’ve only been playing with Haskell for a few days, but it seems to work. It’s probably not the best way to do it, and maybe I didn’t even exactly follow the binary search algorithm (I think I did… did I?), but that’s what I came up with.
Hope you guys like it :D
int binsrch(int count, int *integers, int n)
{
    int mid;
    int *ptr = integers;
    while (count > 0)
    {
        mid = count / 2;
        if (ptr[mid] == n) {
            return mid + (ptr - integers);
        }
        if (ptr[mid] < n) {
            ptr = ptr + mid + 1;
            count -= mid + 1;
        }
        else {
            count = mid;
        }
    }
    return -1;
}
@Kragen Javier Sitaker: nope it won’t, BUT it isn’t correct either:
That case fails. So I’m not in the 10%, shame on me! :)
Yup, overlooked a silly bug in mine, fail++
Python, not tested
Addendum: as a recent Compsci Grad I am contractually obligated to know nothing and fail consistently. Therefore all errors found here-in are intentional and are to be commended thoroughly .
Doh, fails when the search item is not in the container. To my defense, it’s 4:30 in the morning and I’m going to sleep now.
Note to self: if you get up at 4 in the morning to get a drink of water, check the time on your laptop and see that someone sent you a link to a blog post, ignore it until morning.
Plain C code, worked at first try though I did test it before submitting here, which took some time before I realized that while the binary search was working, my small test code was faulty.
5 minutes to write, another minute to fix syntax errors; untested
after reading the description further and encountering the part about “index overflow”, I was banging my head on the table as I fell in exactly that trap :/ for “integers” it does not matter, but for “chars”, it would
I just saw you have more than 400 replies to this entry, and you only posted last night!
Why do you think your blog has exploded in popularity? The catchy title, the uncooked fish (sorry I had to) or the witty writing?
I think the results of this experiment will come out differently than the 10% success rate mentioned. Telling people to only post anonymously so they can’t receive shame nor credit for their code would have been better.
With the amount of readers you have on your blog, you could conduct some pretty interesting experiments.
I have used the same prototype of bsearch. Didn’t test it.
After testing it looks good.
I tested with random values, which resulted in arrays with repeated values. That is something I didn’t have in mind when writing the code. It turns out my function returns the first of the repeated values, which is nice. Lucky me :)
The code (again) and the random test:
Written in Redcode. :-) Worked first time but I did a couple of test cases on paper while I was coding. Is that cheating?
Compiled, but not tested.
Did it in about 30 minutes while watching TV. As far as I know, it’s bug-free.
forgot to put the source-meta stuff in, now ok. I tested meanwhile and know I belong to the 90%.
My implementation is here:
http://www.lucabacchi.it/blog/post/view/75004
I thought I’d give it a crack in C#.
Disclaimer 1: I’m pretty good in C and C++ but I have no serious experience with C#.
Disclaimer 2: I’ve walked myself through this code in my head but I haven’t tested it.
The BinarySearch() function returns the index of the item if found, otherwise -1.
I failed. My version ends up in an infinite loop if the value I’m searching for is larger than all values in the array, or if there are only two values in the array and the searched-for value is between those two.
Having tested my code, it seems to be correct as far as I can tell. The only thing I missed is the possibility that the array reference passed can be null. I’d treat that as a precondition, so I’d insert the following line at the start of the function:
no check…
I’m gonna test it after posting.
I’m pretty scared of the result though.
{source}
def binary_search(x,v):
    def aux(a,b):
        if a==b:
            return a if v[a]==x else None
        else:
            p = (a+b+1)/2
            if v[p]>x:
                return aux(a,p)
            else:
                return aux(p,b)
    return aux(0,len(v))
{/source}
I’m kind of a n00b who only knows how to “paste libraries together” in C#, so it was fun to do this exercise and feel like a real programmer :-D
I had two errors in my original attempt, both of which were easy to correct when I ran a simple test.
Here is the corrected code, with comments about my original errors.
It is probably not a very elegant implementation, but it seems to work.
Well, I first wrote it (10-20 minutes) and then read to the end, where you say “do not test”.
I know myself, I make a lot of mistakes. So I always write software with help of testing and debugging.
For this task, testing was enough, I did not use debugger.
Warning! Super-ugly C# code below (written in LINQPad)
var testArr = new int []{1,2,2,3,3,3,3,5};
var found=false;
var findVal=5;
var endPos=testArr.Length;
var startPos=0;
int center=-1;
if (testArr.Length >0)
    do {
        var prevCenter=center;
        center=startPos+(endPos-startPos)/2;
        var curVal=testArr[center];
        if (curVal==findVal) {
            break;
        }
        if (prevCenter==center){
            center=-1;
            break;
        }
        // The "less than" branch was lost in posting; reconstructed to match the stall check above.
        if (curVal < findVal){
            startPos=center;
        }
        if (curVal > findVal){
            endPos=center;
        }
    }
    while (!found);
(center).Dump();
Scheme version, did no changes after testing it.
(define (bsearch item vec)
  (define (iter start end)
    (if (> start end)
        -1 ; search failed
        (let* ((middle (quotient (+ start end) 2))
               (mid-val (vector-ref vec middle)))
          (cond ((= item mid-val) middle)
                ((< item mid-val) (iter start (- middle 1)))
                (else (iter (+ middle 1) end))))))
  (iter 0 (- (vector-length vec) 1)))
As far as I can see, this seems to work. Written according to the rules.
I tested the code I posted earlier. Searching an array of all the numbers between 0 and 100 million that are divisible by 3 only takes a couple of seconds and gives the correct result.
If I increase that number to 1 billion then I get memory overflow. This could of course be fixed by making it non-recursive but I’m too lazy for that :-)
Just for kicks, I ran through the first 100 comments or so and tallied the language choice:
Python 40
C/C++ 36
Unknown/pseudocode 6
Lisp/Clojure/Scheme 5
PHP 4
Three each: Java, Perl, C#, JavaScript
Haskell 2
One each: VB, Delphi, Smalltalk, FORTRAN, Lua, Objective-C, ColdFusion
Conclusion: Almost everyone (who cares about implementing binary search, anyway) uses C/C++ or Python.
BTW, Google had me do this in my phone interview (reading code over the phone = not fun). I made the same off-by-one error as several commenters (< instead of <=) but caught it a few minutes later while we were still on the phone.
I *think* my first version works:
Failed… just. :-)
Firstly, I coded an inelegant recursive solution, and only when I started writing the test cases I realised I forgot the case of an empty array and a null array.
For valid sorted arrays, however, it worked just fine, including searching for an item not in the array (returning -1).
I made three mistakes. Here is my original code.
{source}
public class A{
public static void main(){
search(7);
search(40);
}
public boolean search(final int n){
int[] a = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
int middle;
int left=0;
int right=a.length-1;
for(;leftmiddle){
left = middleindex+1;
}else {
right= middleindex-1;
}
}
System.out.println(“Search “+n+”but result not found “);
return false;
}
}
{/source}
As you can see, the main() function didn’t take parameters; I forgot to make the search() function static; and middleindex should be (right+left)/2, which was a clerical error.
Please let me know if any other bug exists.
I literally jumped in writing that code right when I read “try it” :-D
I ended up with this and it seems to work
Basically it mimics array_search(), but without implementing its $strict parameter.
this one works fine; second try though, elisp by the way
Python, success, I definitely thought about it a lot harder when I couldn’t test.
def binary_search(array, elem):
    l, r = 0, len(array)-1
    while True:
        m = l + ((r-l)//2)
        if array[m] == elem:
            return True
        elif r == l:
            return False
        elif array[m] > elem:
            r = m
        else:
            l = m + 1
Pingback: 你是那10%可以写出二分查找的程序员吗? » 为之漫笔
I can’t resist posing the evil question: if your array can be in size up to Integer.MAX_VALUE (for Java, but applies elsewhere as well), are you sure your code will not create integer overflows in your arithmetic? The code is only correct if it works for any arbitrary array being passed in…
@Martin Probst:
You are obviously right. Count me in with the 90% then…
but @Avi:
I think your correct solution is unnecessarily complicated; someone else further up wrote something like this:
middle = first + ( (last - first) / 2);
This should also avoid the problem of integer overflows, shouldn’t it?
@Martin Probst: at least mine shouldn’t
@EoghanM: shouldn’t you return the position instead of just “yeah, found it in there”? :-)
whoops, fell into the wordpress markup trap! Once again…
My code was correct.
Here’s mine (recursive in Objective-C): http://gist.github.com/372377
Time taken: about 20 minutes.
I haven’t found any errors in my casual testing.
Oh and here’s how you might test my earlier C++ binsearch:
I also posted a little bit about the methodology here: http://www.reddit.com/r/programming/comments/bt7nh/according_to_jon_bentleys_book_programming_pearls/c0oh6po
This methodology comes from (or rather, where I learned it from is) the C++ standard algorithms library, which is well worth examining because it has stellar design notwithstanding the ugliness of the language itself. The basic principles are applicable in most languages.
Tested it, worked well for finding values but blew up trying to find value 200.
I failed, with a simple and easily found error, but I did it when I read “try it” in the quoted section, not your later challenge. So I assumed that testing was allowed, and stopped to do so early. Trying to write perfect code straight out of the blue seems pretty pointless. Some people even say you should write thorough tests first and then code until they pass. The scope of this example is about as big as you can get and hope to have any success just doing it all in your brain, so what’s the point?
Works for my test cases first try. I know it’s a bit ugly but no one said write elegant or pretty code.
I failed. That is all.
First pass, buggy. Tried to be smart and got the sort condition backwards. Added one comprehensive test, fixed the bug. search(array, e)==i for all e=array[i] in a random, sorted array.
That said, coding without tests is just trouble waiting to happen. Doing so is part of my standard work routine, and explicitly specifying “no tests” pushes this firmly into the realm of theoretical uselessness, even for a random exercise.
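The "one comprehensive test" described above, search(array, e)==i for every e=array[i] in a random sorted array, might look like the sketch below. The commenter's code was not preserved, so `binary_search` and `check_random` are hypothetical stand-ins. Values are kept distinct so each element has exactly one valid index, and the check is extended (an addition, not part of the original comment) to values that are absent.

```python
import random


def binary_search(items, target):
    """Reference implementation under test: index of target, or -1."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def check_random(search, trials=200, seed=0):
    """Return True if search agrees with the expected index on random inputs."""
    rng = random.Random(seed)            # fixed seed keeps failures reproducible
    for _ in range(trials):
        size = rng.randint(0, 20)        # includes the empty array
        items = sorted(rng.sample(range(100), size))
        for i, e in enumerate(items):
            if search(items, e) != i:    # every present value maps to its index
                return False
        present = set(items)
        for absent in range(-1, 101):
            if absent not in present and search(items, absent) != -1:
                return False
    return True
```

Sizes from 0 to 20 cover the empty, one-element and two-element arrays that trip up many of the attempts in this thread.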
9 minutes
use warnings;
use strict;
use POSIX qw(floor);
use Test::More qw(no_plan);
sub bsearch {
    my ($list, $target) = @_;
    return unless @$list;
    my $middle_index = floor($#$list / 2);
    my $middle_val = $list->[$middle_index];
    if ($middle_val == $target) {
        return $middle_val;
    }
    elsif ($middle_val < $target) {
        return bsearch([@$list[$middle_index + 1 .. $#$list]], $target)
    }
    else {
        return bsearch([@$list[0 .. $middle_index - 1]], $target)
    }
}
for (1..10) {
    ok(bsearch([1 .. 10], $_))
}
ok(!bsearch([1 .. 10], 0));
ok(!bsearch([1 .. 10], 11));
on my code, i wrote it to only work for integers, it could be modified to work for floats too
also, i got it on probably my 4th or 5th run-through when debugging
I didn’t get it quite right on the first go because of a stupid mistake – in the elif/else brackets, I’d originally just written “binsearch(…)” instead of “return binsearch(…)”. Other than that it appears to work.
Wow, from a quick audit it seems that mis-handling empty arrays is by far the most common mistake. Eg from a sampling of the solutions stated to be correct, I found it in 8 solutions (@finsprings, @Aaron, @Mike J, @Langtree, @Luke, @Marcus, @cycojesus, @Erik Swanson), with only 1 having a different mistake (@donaq, which was caught later).
Success first time with Python.
def binSearch( array, item ):
low = 0
high = len(array)
while (high>low):
mp = (high+low)/2
if (array[mp].key == item): break
if (array[mp].key low):
return array[mp]
return None
Great work, Eric! I thought we might see a bit more of people pointing out bugs in each others’ code, but it’s nice to see that you weren’t satisfied with destruction-testing just one solution :-)
And, hey, the rest of you — you’re not going to take this lying down, are you? :-) Can someone find a bug in Eric’s submission? It’s at
reprog.wordpress.com/2010/04/19/are-you-one-of-the-10-percent/#comment-1869
Ok – that code didn’t paste correctly – lost indentation and lost middle part of text.
{source}
/**
 * if success,return the index of [toFind] in [_data]
 * else return -1.
 * @param _data source data
 * @param low start index of source data
 * @param high end index of source data
 * @param toFind number to find
 * @return
 */
public static int search(int[] _data, int low, int high, int toFind){
    if(_data == null){
        return -1;
    }
    if(low > high){
        return -1;
    }
    if(low >= _data.length || high >= _data.length){
        return -1;
    }
    int mid = (low + high)/2;
    if(_data[mid] == toFind){
        return mid;
    }else if(_data[mid] > toFind){
        mid = mid - 1;
        return search(_data, low, mid, toFind);
    }else{
        mid = mid + 1;
        return search(_data, mid, high, toFind);
    }
}
{/source}
Only works for integers.
Written in Java.
There may still be bugs to find.
I did this in ruby, and my first code was not correct. I had an infinite loop because I was not shrinking the size of my window.
That said, the entire thing was coded, tested, corrected in less than 10 minutes, so I guess this is also a repeat of the “why is it important to be right first, as long as you are right at the end”. Testing is important for code.
python:
def bsearch(needle, haystack) :
    idx = len(haystack) / 2
    if idx == 0 :
        return -1
    if needle == haystack[idx] :
        return idx
    if needle < haystack[idx] :
        return bsearch(needle, haystack[:idx])
    if needle > haystack[idx] :
        return bsearch(needle, haystack[idx+1:])
success! i think. it may lose in an edge case (like haystack of length 0) but that’s outside the scope of the algorithm and more an exercise in defensive programming IMO.
Coded and pasted here with no testing.
Third branch of the if compares in the wrong direction, so if the target is in the first half of the array, it would narrow to the second half. Whoops.
I felt my heart sink when it threw an exception on first run, but it turned out the error was forgetting to sort the test arrays in the test code, so all was ok :)
After I wrote the code in an editor, I did do about 4 example runs-through (algorithm in my head; pen and paper for variable store) which was how i remembered to add 1 to mid for the slice and return in the RHS block.
I’m not sure if that counts as cheating, but I’m pretty sure it doesn’t.
10mins. and maybe 20-30 more for the test code.
Failed. (Python)
Failed on boundary case! (Only gave myself 15 minutes, but still…)
Mine did not run the first time I ran it. Syntax error. Fixing the syntax error, I popped the stack.
Crap.
I’m way late to the game, but if you are indeed tallying the results, add me to the list. I assumed arguments would not be null. Up to you whether you consider that a bug or not for the purposes of this exercise.
Recursive solution written in C#, in about an hour:
Ryan,
It is astounding, but yours is the 500th comment on this entry!
It seems obvious to me from the problem description that you can assume the array reference is not null (i.e. there is actually an array!) but that zero is a perfectly good number of items for that array to contain. So programs that fail when passed a null Array are fine; programs that go crazy when asked to search an empty array fail the test, and diminish, and go into the west.
Some of you guys can write such compact code!! I wrote mine before reading the ‘no testing’ rule so here goes:
Mike,
Congrats on the new comment record :o)
You seem to have struck a nerve with a lot of people. I know for one that I’ve learned something from participating.
Thanks,
Ryan
I had an off-by-one error, a comparison flip, and I forgot to propagate the slice offset back up the call chain.
The last argument needs to be zero when called by a user.
One more fix: the offset needs to be added to the pivot index in the recursive call.
@Jeff: Handling a haystack of length 0 certainly IS within the scope of this problem. An array with no elements is sorted (see: vacuous truth), and so is valid input. Also, it doesn’t appear that your code handles arrays of length 1 either (because len(haystack)/2 == 0 in that case).
Did it in *looks guilty* LabVIEW. Worked on first run. Fun post, thanks for the distraction!
@David Mihola: Yes, it probably is too complicated. But since “For the purposes of this exercise, the possibility of numeric overflow in index calculations can be ignored”, I didn’t worry about cleaning it up too much – I just copied what I originally wrote.
my cheat: I did this last week to demonstrate the principle to a friend. No copy+paste, though. All from my head. This is probably not the fastest way to do it.
def bsrch( d, target ) :
    if len(d) == 1 and d[0] != target:
        return None
    print("Searching ", d, " for ", target)
    midpoint = len(d) // 2
    value = d[midpoint]
    if target == value:
        return midpoint
    else:
        if value > target:
            rval = bsrch( d[:midpoint], target )
            if rval != None:
                return rval
        else:
            rval = bsrch( d[midpoint:], target )
            if rval != None:
                return rval + midpoint
Untested and took about 15 minutes.
{source}public static int search(List<String> sortedHaystack, String needle) {
    int startRange = 0;
    int endRange = sortedHaystack.size() - 1;
    while (endRange - startRange >= 0) {
        int midRange = startRange + ((endRange - startRange) / 2);
        String value = sortedHaystack.get(midRange);
        int comparison = value.compareTo(needle);
        if (comparison == 0) {
            return midRange;
        } else if (comparison < 0) {
            startRange = midRange + 1;
        } else {
            endRange = midRange - 1;
        }
    }
    throw new IllegalStateException("Unable to find needle.");
}
{/source}
Maybe 8 mins in PHP and running it in my head. Not proud of it in parts, and still haven’t tested it. My ego is afraid to…
but I should read the instructions, sorry for the html mess:
argh, right after posting it, I can see I pasted the wrong one. Middle should be:
He, not even interested in trying – it’s a sheer bullshit metric. I doubt that my mechanic is in the 0.1% that can take my tires off without a wrench, either, just using his bare hands.
;)
Ok, well, I haven’t been able to actually test this and it’s been a while since I’ve written C but here we go:
http://pastebin.org/160265
So, that was done in Notepad++ and I’m yet to compile it. Fingers crossed :)
Now to look at some of the answers :)
Andrew
@Avi: I see. And just in case: I didn’t mean to sound offensive – my intention was more to check if I was right than to tell you that your solution could be better – at least yours was correct whereas mine wasn’t…
Here is my implementation in Perl. I tested it with a simple lexically-sorted array (‘a’..’z’) and it seems to work. It will probably fail horribly with numerically-sorted arrays.
Here is an example of how to call it:
Testing didn’t reveal any bugs, if anyone finds a bug I’d be most interested in hearing it…
Decided to have another look….FAIL. Mine dies if the search is larger than the largest item in the array.
We should go back to the essence of this blog post. I think Bentley’s point was that after explaining the algorithm in English it sounds easy to write. Easy enough that any real programmer could probably spit out a working implementation without actually trying it. His point is that this is deceptive; the code is trickier than the explanation.
Perhaps because we’re plugging other people’s libraries together so often, we’re used to black boxes. I think that’s why so many people are appalled by the idea of not testing. Testing is quicker and easier than reading the source of the libraries you’re using.
@acastle: In your recurse I think it should be RETURN bsearch(…, but other than that your corrected version looks ok.
@Tristain: You are using max to be exclusive, correct? Then in the recurse when value < range[mid], you should not use mid-1 as the new max or you skip the index before mid as well. E.g. try adding a test case to search for 5 in your array.
Manually tested with a few values and lists.
Untested but it should work
I actually ended up doing a little testing before I posted it. Only error I found was forgetting that python list slices are [s,e) and not [s,e].
>>> 2 // 4
0
>>> 5 // 2
2
>>> 2 // 2
1
>>> 4 // 2
2
>>> 3 // 2
1
>>> my_list = [i for i in range(100)]
>>>
>>> def find_index_of_i(my_list, i):
… s = 0
… e = len(my_list) – 1
… while True:
… if len(my_list[s:e]) <= 2:
… if my_list[s] == i: return s
… elif my_list[e] == i: return e
… else: return -1
… middle = s + (e – s) // 2
… if my_list[middle] == i: return middle
… elif my_list[middle] i: e = middle
…
>>> find_index_of_i(5)
Traceback (most recent call last):
File “”, line 1, in
TypeError: find_index_of_i() takes exactly 2 arguments (1 given)
>>> find_index_of_i(imy_list, 5)
Traceback (most recent call last):
File “”, line 1, in
NameError: name ‘imy_list’ is not defined
>>> find_index_of_i(my_list, 5)
-1
>>> 99 // 2
49
>>> 49 // 2
24
>>> 3 // 2
1
>>> my_list = [i for i in range(100)]
>>>
>>> def find_index_of_i(my_list, i):
… s = 0
… e = len(my_list) – 1
… while True:
… if len(my_list[s:e + 1]) <= 2:
… if my_list[s] == i: return s
… elif my_list[e] == i: return e
… else: return -1
… middle = s + (e – s) // 2
… if my_list[middle] == i: return middle
… elif my_list[middle] i: e = middle
…
>>> find_index_of_i(my_list, 5)
5
>>> find_index_of_i(my_list, 10)
10
>>> find_index_of_i(my_list, 20)
20
>>> find_index_of_i(my_list, 21)
21
>>> [0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5]
>>> li = [0, 1, 2, 3, 4, 5]
>>> li[0]
0
>>> find_index_of_i(my_list, 0)
0
>>> find_index_of_i(my_list, -30)
-1
As I said, I’m confident that my solution works; I hope that I’m correct.
Still, it’s fair to note that I almost forgot a couple of details. For example, I almost left line 18 as int pivot = (upperIndex - lowerIndex) / 2; forgetting to add lowerIndex back to the pivot, and line 22 was a near miss too: I almost forgot to check for the pivot becoming greater than the upperIndex when I add 1 to it on line 24.
As they say, the devil is in the details :p
Wait, “NO TESTING until after you’ve decided your program is correct.” ?
I’m a Lisp programmer. There’s no such thing as “not testing”, since I test every little bit in the REPL as I go.
Here’s a test harness for a Javascript function following the same contract as mine (which passed):
So far, my implementation has passed all the tests (surprisingly!). I did this in about 8 minutes and to be honest I didn’t even consider some cases (e.g. empty arrays) but so far every test I did has passed. Also my implementation does not add +1 or -1 to the middle index, does anyone care to tell me if my implementation is “good”?
BTW it’s a bit boggling to see the solutions doing array slices. They’re O(n).
André: try binary_search({0}, 0).
@Eric Good Call!
Nice catch Darius. This test condition is wrong
Wrote it in Java, recursively. And got the recursion wrong the first time – but that’s what unit tests are for :)
This is one of the reasons that I favour the use of ready-made libraries: trying to write your own sorting/searching/whatnot algorithm is a beginner’s error (in and of itself). There are well-tested and highly performant libraries for such basic functions available in any established high-level language; only rarely are the small gains possible from a custom-made solution enough to justify writing an algorithm from scratch.
Even if such justification can be found (possibly writing the first such library for a newer language) the correct way is obviously to grab a good book on the topic, pick a suitable implementation, and (if needed) translate/modify it.
More generally, in today’s software development, the main task at hand is limiting and controlling complexity of various kinds. One way this is done is re-use: a less optimized solution with re-use is usually preferable to an optimized special-purpose solution. (Notwithstanding that some libraries and frameworks are too bloated to always be acceptable.) We can hope for the return of small Unix tools that each individually are of sufficiently low complexity that such measures are not needed; however, in reality, we are stuck with ever growing applications with ever increasing complexity, and in today’s world the good developer is not the one who understands complex code and algorithms, but the one who avoids or manages them. A sad, but near inevitable development.
(I realize that this was likely not the point you were making.)
Aww, what the heck …
So are these the bugs?
A is empty.
Can’t find T in certain places or certain sizes of A (even? odd? powers of 2?)
Goes into an infinite loop sometimes. (How do you test for this?)
Becomes a linear search sometimes (Did you test for this?)
Tries to access off the end.
Doesn’t return “not found” sometimes when T isn’t in A.
T is less than the first or greater than the last element.
Does (a+b)/2, which overflows with ints. (Did anyone here find this one by testing?)*
*The last is a problem if you have 2^31 one-byte elements to search (it could happen!). Or would be if you were searching a function or disk file rather than an array. Or if you used 32-bit ints (not size_t) to index arrays in a 64-bit computer, which I think would be the more basic problem. Or, if you’re in Python 2.x not 3.x, overflow bumps you into long ints, and then your function returns a long int which could have cascading inefficiency effects :-(
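The overflow in that last item is easy to see with concrete numbers. Python's own integers never overflow, so the sketch below simulates a signed 32-bit register with a hypothetical helper, `to_int32`, that is introduced purely for illustration. The two index values are made up; each fits in 32 bits on its own, but their sum does not.

```python
def to_int32(n):
    """Wrap n the way a signed 32-bit register would (two's complement)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


lo, hi = 1_500_000_000, 2_000_000_000   # both fit in a signed 32-bit int

# (lo + hi) / 2: the intermediate sum wraps around to a negative number.
naive_mid = to_int32(to_int32(lo + hi) // 2)

# lo + (hi - lo) / 2: no intermediate value ever exceeds hi.
safe_mid = to_int32(lo + to_int32(hi - lo) // 2)

print(naive_mid)   # -397483648
print(safe_mid)    # 1750000000
```

No test with small arrays can reach this: the bug only shows up once the indices themselves get close to the limit of the integer type.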
LOL @ Vince’s assembly version. Short, too!
But, come on people — does no-one write COBOL any more? Or APL? Or ETA?
Oops, forgot:
Has a problem with duplicate entries.
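Most of the bugs on that list can be hunted mechanically. The following is a rough Python sketch of such a check, under an assumed contract: `search(items, target)` returns any index holding `target`, or -1. The names `check_search` and `reference_search` are hypothetical. It does not catch infinite loops (the check would simply hang) or a search that quietly degrades to linear time.

```python
import bisect


def check_search(search, max_len=9):
    """Exhaustively check search(items, target) on small sorted lists.

    Assumed contract: return any index i with items[i] == target, else -1.
    """
    for n in range(max_len + 1):               # n == 0 covers the empty array
        items = [2 * i + 1 for i in range(n)]  # odd numbers: 1, 3, 5, ...
        for target in range(0, 2 * n + 2):     # hits, gaps, below first, above last
            got = search(items, target)
            if target in items:
                assert 0 <= got < n and items[got] == target, (items, target, got)
            else:
                assert got == -1, (items, target, got)
    # Duplicate entries: any matching index is acceptable under this contract.
    dupes = [1, 2, 2, 2, 3]
    assert dupes[search(dupes, 2)] == 2


def reference_search(items, target):
    """Known-good stand-in built on the standard library's bisect module."""
    i = bisect.bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1


check_search(reference_search)
print("all checks passed")
```

Trying every length from 0 to 9 covers the even, odd and power-of-two sizes in one go, and probing every value between the elements covers the "not found" paths.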
In Redcode :-)
@vince: It doesn’t look like you’re handling 0-length arrays correctly? That is, if ecx == [ebx] when edi==0, 0 will be returned instead of -1. Assuming [ebx] doesn’t cause an exception in that case, that is.
Now ETA would be fun. But Mike, the task is surely yours….
I wrote the code in eclipse because it was already opened.
I did not get it right in the first try, I executed the code once, found a bug, and fixed, so this is the second try and it works fine :D
Just found a bug in the code I posted, the code cannot find the last element of the array, to fix, just remove the “-1” from the main, and pass the array length there instead of the last position.
let rec binSearch target sortedArray =
match sortedArray with
| [||] -> None
| _ ->
let maidenIndex : int = sortedArray.Length / 2
if target sortedArray.[maidenIndex] then binSearch target sortedArray.[(maidenIndex + 1)..(sortedArray.Length – 1)]
else Some(sortedArray.[maidenIndex])
Wrote it in about 20 minutes, then tested it and it works great even when the element isn’t in the list.
Here’s my best shot in Python. I just wrote it by hand, so I might have messed up the syntax a bit (although it looks ok).
search(val, data):
L = len(data)
mid = L / 2
mid_val = data[mid]
if val == mid_val:
return mid
if L <= 1:
return None
if val mid_val:
search(val, data[mid + 1 : L])
@eric: You’re right of course. the code doesn’t check for zero length arrays. [ebx] itself can’t cause an exception because ebx is a valid address on the heap, but nonetheless a bug’s a bug, and so take my seat among the 90%…
Didn’t work! And I have remembered gist: http://gist.github.com/373070
Ah, I see. The WordPress bug. Sorry for being notorious. If now it won’t paste properly then I’m done!
{source}
import util.Random
val toFind = args(0).toInt
val r = new Random()
val a = (for (i<- 1 to 100) yield r.nextInt(100)).toList.sort(_<_).toArray
def binsearch(x:Int, a:Array[Int],first:Int, last:Int):Int = [
if (last-first=x) return binsearch(x, a, first, mid)
else return binsearch(x, a, mid, last)
]
]
for (el<-0 to 99) println(el+":"+a(el))
val res = binsearch(toFind, a, 0,99)
println ("res="+res)
{/source}
I meant posting some code here didn’t work! But my code worked! :-) So I’ve decided to make a gist! (I was thinking along my previous entries! OOps!)
Let’s try once more. A post somewhere up above says to use {source}…{/source} but with square brackets. (Do you think he meant angle brackets?)
Arghhh I surrender ! Is it because I use FireFox/Ubuntu? Anyway, in my blogspot blog it got pasted properly: http://anaivecoder.blogspot.com/2010/04/binary-search-script.html :)
Well, I thought I had it, I really did. It seemed right until I hit an edge case, an array of 2 items with the first of the 2 equal to the search value.
So I guess that puts me in the fail category as well. But here it is after I debugged (mentally walked through the code line by line to see where it was missing the value). Really, I should have thought of the edge case before I tested. :-/
Here was my initial attempt in ruby:
def binary_search(to_find, sorted_array)
return false if sorted_array.empty?
middle_idx = sorted_array.size / 2
middle = sorted_array[middle_idx]
return middle if middle == to_find
return binary_search(to_find, sorted_array[0..middle_idx]) if to_find middle
end
For the keen eyed, this does contain a mistake that I discovered in my quick testing attempts.
My subsequent, corrected version is as follows:
def binary_search(to_find, sorted_array)
return false if sorted_array.empty?
middle_idx = sorted_array.size / 2
middle = sorted_array[middle_idx]
return middle if middle == to_find
return binary_search(to_find, sorted_array[0...middle_idx]) if to_find middle
end
Points for those who spot the differences ;) (There are two :) )
Mike,
You wrote: “JEIhrig ,please check once more the instructions at the bottom of the post on how to post code in a WordPress comment. If you don’t do this, then . Repost your comment using the correct form, and I will delete the old broken version.”
I already read that but must have missed the instruction to replace the curly brackets with square ones. It wasn’t just the ‘greater than’ symbol that was missing: commented code somehow became uncommented, and another section was missing altogether, which led me to believe it wasn’t misinterpreted by WordPress but actually changed by someone. If it posts correctly this time, hopefully all my previous posts will be deleted.
After re-pasting my code for the third time, (Still not compiling) I noticed a problem. It returns ‘key’ whether or not it’s in the list. So I changed that. (Does it still count if I made a modification after initial submission though I never compiled?)
JEIhrig, I have deleted your previous posts now that this one is up, and properly formatted. Yes, I think it’s fine to change your submission so long as you make your change by inspection rather than only realising after tests have shown you the problem. Let us know how the tests go!
failed.
I tried it. But I couldn’t help testing: not confident enough in my code to do that! And sure enough I got it wrong. Several times. Having test cases helps.
And, it’s one thing to have test cases. It’s another to have test cases that a) cover all cases, and b) are themselves correct!
Looking forward to reading your summary :)
Since I didn’t obey the rules, my submission doesn’t count… but posted anyways.
You know, not being allowed to test while writing code really screws with my TDD brain. Anyway, here’s a proposed solution. Yes, it’s iterative.
Note that this code is in fact also in the never-run class. It checks out as syntactically correct, but I am braced to be horribly embarrassed by what happens once I run it.
Huzzah. It passes the test suite. If there’s a bug, it’s subtle enough that I didn’t think to test for it.
D’A
Not tested, written in 10-15 mins :
(pseudo-code)
After reading what others did in the comments, I realized that I wrote :
int mid = min + max / 2;
Instead of :
int mid = min + (max-min) / 2;
So I guess I failed the test…
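The slip is purely one of operator precedence, and a couple of made-up numbers show how far off it lands. This Python sketch uses `lo` and `hi` in place of `min` and `max`, since those two names are built-in functions in Python.

```python
lo, hi = 6, 9   # searching the index range 6..9 of a larger array

wrong = lo + hi // 2          # parsed as lo + (hi // 2): 6 + 4
right = lo + (hi - lo) // 2   # 6 + (3 // 2): always stays within lo..hi

print(wrong, right)   # 10 7
```

The wrong form happens to work while `lo` is still 0, which is why a first probe into the array can look fine and the bug only appears once the lower bound has moved.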
Not tested:
Eric Burnett: “@acastle: In your recurse I think it should be RETURN bsearch(…, but other than that your corrected version looks ok.”
LOL, and that’s all it takes to fall into the 90%. One missed keyword and my function always returns -1, unless the item happens to be in the middle of the array on the initial call.
Funny how reading through it, my mind interprets that function call as implicitly returning.
I went with a simple recursion in Java. I found some bumps along the road while testing, but haven’t found the need to alter the search at all after I finished and decided to start testing.
ps: actually, I made mistakes when testing, like forgetting I had a null return and printing out the array with the result as index, which obviously got a null pointer exception. Nothing on the bsearch itself; I would like to see if I made any obvious mistake I still can’t see.
@JEIhrig: If key == list[mid], your code hits the else case and recurses with “return search(list, mid + 1, high, key);”, skipping that element and eventually returning false.
@Greg: Mike said to ignore integer overflow in this, so you don’t have to worry about that case. However, it doesn’t look like you handled the case of the array being empty :(.
@Luke Stebbing: You do not handle the case where x is not in xs, nor when xs is empty. Essentially, there is no path for your code to return -1/False/None.
The second edition doesn’t change this passage.
seems to work, finds all the names in the list, and returns nil if the name wasn’t there
Seems like my last comment didn’t go through. Anyway I have to say this is a little hard to believe. Where do they find those 90% of “professional” programmers? BSearch is seriously not that difficult to write. Anyway here’s my untested version in python:
Here is my test code btw:
Success, took about 10 minutes. The Haskell code:
This code passed all my tests the first time, so I’ll tentatively claim success. Though if I’m proven wrong, that’ll mean both my code AND my tests were crap…
Worked first time, took 7 minutes to code in Java including unit testing.
I would not expect any developer to create a correctly working implementation first time, as there are too many ways to make easy mistakes – however, I would expect a good developer to be able to write, test, and complete a solid implementation in less than 30 minutes.
I’m relatively new to programming and this was a fun challenge. Reading up it looks like my PHP solution is pretty close to Fred’s.
In retrospect the code feels a bit too easy to trip up though – reading the comments I can see there’s quite a few cases I didn’t account for.
In sporadic testing of my previous entry over the last 24 hours, I haven’t found a single case that fails, including the tests kindly provided by Steve Witham. While this isn’t completely conclusive, it’s a good sign that I’m in the 10%.
A few notes:
Writing the code took about 6 minutes. I spent another 5 scrutinizing the code to make sure I didn’t miss anything. I did try to import the code into the shell, which uncovered a syntax error that I fixed before my initial post (this was allowed by the rules).
My solution didn’t include the workaround for overflow problems, but I’m not sure Python is susceptible to that particular issue. I know I’ve used the workaround in previous incarnations in the distant past; I don’t think I’ve worried about it since 16-bit days. It’s quite unlikely that I’ll feed it an array of greater than 2^30 in size, and the rules explicitly state that overflow can be ignored.
I’m also quite surprised by the submissions that used slicing to divide the input array, because it defeats the whole purpose of a binary search. Binary search is supposed to be an O(log n) algorithm, but slicing is an O(n) operation.
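The cost of slicing can be made visible by counting how many elements get copied. The sketch below is a hypothetical slicing-style search written for this illustration (it is not any particular commenter's code); it returns a found flag together with the number of elements copied along the way.

```python
def search_with_slices(items, target, copied=0):
    """Recursive search that slices; returns (found, elements_copied)."""
    if not items:
        return False, copied
    mid = len(items) // 2
    if items[mid] == target:
        return True, copied
    if items[mid] < target:
        half = items[mid + 1:]      # copies everything right of mid
    else:
        half = items[:mid]          # copies everything left of mid
    return search_with_slices(half, target, copied + len(half))


data = list(range(1_000_000))
found, copied = search_with_slices(data, -1)   # worst case: target absent
print(found, copied)
```

The copies add up to roughly n/2 + n/4 + n/8 + …, which is close to n, so the search still makes only about 20 comparisons on a million elements but moves nearly a million of them to do it. Passing lower and upper indices instead of slices avoids the copying entirely.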
I can’t believe how few posters did this in Perl. My first pass was recursive (and worked) but for kicks I converted it to iterative (and forgot to save the old version).
@Eric Burnett: As I mention in bsearch()’s docstring, I’m interpreting the problem as “return the index where insertion preserves sorting”. IIRC, the original post didn’t state how edge cases should be reported, and I find that returning a valid index in all cases is much more useful than returning -1 or None. (I picked that up from the C++ STL convention of returning v.end() on a failure to match.)
Given an empty list, my code returns 0, indicating that [].insert(0, value) preserves sorting: [] is sorted, as is [value]. Given bsearch(3, [1, 2, 4]), by inspection I believe I properly return 2, which is the position where 3 should be inserted to maintain sorted order. (I don’t have a development environment set up on my iPad yet, so I can’t verify that. That’s also the reason I didn’t test my solution.)
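For readers who want to see the "return the insertion point" convention spelled out, here is a minimal Python sketch. It is not the commenter's code, which is not reproduced in this thread, and the name `insertion_index` and its argument order are assumptions. It behaves like the standard library's `bisect.bisect_left`.

```python
def insertion_index(xs, value):
    """Leftmost index at which value can be inserted keeping xs sorted."""
    lo, hi = 0, len(xs)          # half-open range [lo, hi)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if xs[mid] < value:
            lo = mid + 1         # value belongs strictly to the right of mid
        else:
            hi = mid             # value belongs at mid or to its left
    return lo


print(insertion_index([], 3))          # 0
print(insertion_index([1, 2, 4], 3))   # 2
```

Every return value is a valid argument to `list.insert`, including 0 for the empty list and `len(xs)` when the value is larger than everything present.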
After further analysis, I fail. Specifically, I will fail to find array(2) if array has size 6. Amusingly, I found the flaw by analyzing the code, and not by testing!
@Luke Stebbing: Hmm, fair point. I should have read your doc string more carefully – I retract my comments :).
shit, didn’t read the quoting instructions fully:
def bs(array,findme):
if(array):
array = sorted(array)
len = array.__len__()
pos=len/2
mid=len/2+1
while(mid):
#trunk because its an integer
if mid%2 and mid != 1: mid=mid/2+1
else: mid=mid/2
if (findme == array[pos]):
return array[pos]
elif (findme > array[pos]):
pos+=mid
elif (findme len-1 or pos < 0): break
return "None"
a = [1,5,3,4,7,8,9]
print bs(a,9)
a = [1,5,4,3,9]
print bs(a,-12)
print bs(a,3)
a = [-19,-3,34,5,9,12]
print bs(a,-32)
print bs(a,34)
print bs(a,324)
a = [-21,-3,14,42,54,2,43,1,5,7,2,-14,423,-3]
print bs(a,-21)
print bs(a,423)
print bs(a,12424)
print bs(a,7)
oh fack… i hate wordpress, well deduce the tabs hehe
Success. Or, erm, at least it passes the tests I tried.
Code:
Tests:
My solution found the correct value as far as I could determine with my test cases, but only returned a boolean. When I switched to Steve Witham’s test cases, which expect a return value of position and steps, I failed on the position (after converting my code to return those, of course). Not sure if that should count as a fail or a success. Thanks to Steve for the test cases.
Here is the current code:
C, 15 minutes.
Luke Stebbing is right — I didn’t specify what the search should return when the element is not in the array. That was careless of me (although I suppose I could pass the buck to Jon Bentley :-)). Luke, I did mean for an out-of-band value such as -1 to be returned, as most (I think all) of the other solutions have done; but your approach is not prohibited by my statement of the problem, so you get a pass. (Provided there are no other bugs, of course!)
By the way, I love that Eric is continuing to fight the good fight. Keep up the destructive work!
Got it correct (I guess) after 10 iterations or so right here in Firebug.
{source}
function bs(arr, num) {
lower = 0;
upper = arr.length – 1;
while (upper != lower) {
mid = Math.floor((upper + lower)/2);
if (num == arr[mid]) return mid;
if (mid == lower) return num == arr[upper] ? upper : null;
if (num arr[mid]) lower = mid;
}
return null;
}
{/source}
{source}
private static int find(int t, int[] list) {
    if (list.length == 0) {
        return -1;
    }
    return find(t, list, 0, list.length-1);
}
private static int find(Integer t, int[] list, int lower, int upper) {
    if (upper < lower) {
        return -1;
    }
    int probe = (upper+lower)/2;
    switch (t.compareTo(list[probe])) {
        case 0:
            return probe;
        case -1:
            return find(t, list, lower, probe-1);
        case 1:
            return find(t, list, probe+1, upper);
    }
    return -1;
}
{/source}
(Java) no idea if it works yet, I'm dimly aware that "upper+lower" might overflow for a truly gigantic array, but decided not to worry about that.
10 minutes, failed at testing for using
instead of
:(
Groovy version:
def binarySearch(List tab, Number query){
if (tab.size() == 0) return null;
def s = 0, e = tab.size() -1
while (s<e){
int m = s+e/2
if (query <= tab[m]) e = m else s = m+1
}
if (tab[s] != query) return null;
return s;
}
@Mike Taylor: What I like about returning a valid index in all cases is it allows you to avoid a conditional expression when you don’t need one, i.e. if you’re using bsearch() to build a sorted list. The possibility of a wild and crazy index like -1 forces you to guard every single call in all situations.
I’ve firmly believed the “Make sure special cases are truly special” mantra ever since I encountered overly specific definitions in ninth-grade geometry, and I think I’m following that rule here by treating a failure to find the element normally.
Here’s an O(1) way to determine whether my bsearch() found the value. I really should’ve returned (index, found) in the first place, since that would’ve been a more useful signature. (I also should’ve used Python’s a // b shorthand instead of int(a / b), but I simply forgot.)
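A check along those lines might look like the following Python sketch. The commenter's own `bsearch()` is not shown here, so the standard library's `bisect.bisect_left` stands in for it, and the name `bsearch_found` and its argument order are assumptions.

```python
import bisect


def bsearch_found(xs, value):
    """Return (index, found) where index is the sorted insertion point."""
    index = bisect.bisect_left(xs, value)   # stand-in for the commenter's bsearch()
    # O(1) test: the value is present only if it already sits at that index.
    found = index < len(xs) and xs[index] == value
    return index, found


print(bsearch_found([1, 2, 4], 3))   # (2, False)
print(bsearch_found([1, 2, 4], 4))   # (2, True)
print(bsearch_found([], 3))          # (0, False)
```

The bounds test has to come first: for an empty list, or a value larger than every element, the insertion point equals `len(xs)` and indexing there would raise.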
Failed without testing……. yeah I forgot arrays don’t always have to be an ‘even’ length ;). Once I wrote a few tests, I found the ‘bug’ within minutes (seconds tbh).
Sorry, my first post seems to be screwed up. So I post again. I did not test the code beforehand, and I did not follow the google blog link before writing this.
PHP, $range is array, $so is element to find
function binSearch($range, $so) {
$elems = count($range);
if ($elems == 0) return false;
if ($elems == 1) if ($range[0] == $so) return true; else return false;
$middle = $elems % 2 == 0 ? (($range[($elems/2)-1] + $range[($elems/2)]) / 2) : $range[floor($elems / 2)];
if ($middle == $so) return true;
$range = ($middle < $so) ? array_slice($range, ceil($elems/2)) : array_slice($range, 0, floor($elems/2));
return binSearch($range, $so);
}
-- | maybe find a value a in a list of a. took about 20 minutes to be honest.
-- admittedly i wrote it so it searched descending order lists 5,4,3,2,1 and realised
-- this probably wasn't desired, so changed "GT" to "LT"
find :: (Ord a) => a -> [a] -> Maybe a
find k [] = Nothing
find k xs = go (length xs) xs where
  go len [x] | compare k x == EQ = Just x
             | otherwise = Nothing
  go len xs = go len' xs' where
    xs' = case last up `compare` k of
      LT -> down
      otherwise -> up
    -- i dunno if round 2.5 == 3
    len' | even len = len `div` 2
         | otherwise = (len `div` 2) + 1
    (up,down) = (take len' xs,drop len' xs)
seems to work alright
def binary_search(a,s, l=0, u=nil)
u = a.length-1 if u==nil
x = l+(u-l)/2
return x if s==a[x]
return “not found” if (u-l)a[x])
return binary_search(a,s,l,x) if (s<a[x])
end
Simple algorithm; it took an hour, but I failed with 3 stupid mistakes. My fault.
int bsearch1(int what, int elements, int* ar){
int max_el = elements – 1;
int min_el = 0;
//
for (;(min_el=0)&&(max_el<elements);){
int ix;
int delta;
//
if ( 0 != (max_el % 2)){
if (ar[max_el] == what)
return max_el;
— max_el;
};
if ( 0 != (min_el % 2)){
if (ar[min_el] == what)
return min_el;
++ min_el;
};
//
delta = (max_el – min_el)/ 2;
ix = min_el + delta;
//
if (ar[ix] what)
max_el = ix – 1;
else
return ix;
//
};
//
return -1;
};
It took me like five minutes, no errors at the first try:
{source}
def bsearch(ary, val):
lo = 0
hi = len(ary)
oldm = -1
while True:
m = int((lo+hi)/2)
if oldm == m:
return -1
if val ary[m]:
oldm = m
lo = m
else:
return m
{/source}
My guess is that most programmers (even experienced ones) do not mentally check for the obvious edge cases while they’re coding.
This has passed all the tests I’ve thrown at it since writing it. (Python)
Yay! I’m one of the best 10%! Below is my Java-source, I’ve made it generic instead of just ints.
Argh, layout is all fucked up. I guess I shouldn’t have used tabs… Anyway, clicking on ‘View Source’ helps a little…
@Mike: Thanks :). I should point out that I’m not checking 100% of them though… mis-formatted solutions I think code is missing from are skipped, as are languages I can’t wrap my head around yet (*cough* Haskell *cough*).
@Daniel: Just a nitpick – list[mid+1, len-1] is asking for len-1 elements starting at mid+1. Because there aren’t len-1 elements left you’ll get the ones to the end of the array (as desired), but it would be clearer to just say list[mid+1..len-1].
Also, as noted by other people, using array slices makes the algorithm O(n) rather than O(logn).
@Peter Toneby: What exactly you return is not specified, so not exactly matching someone else’s test cases is fine. I think if it passed in original form, that is enough. Your updated code looks good, although I will note that using array slices will make your search be O(n) not O(logn) as well.
@mondodello: It appears your code will mis-handle single element arrays (or ones that end up single element after recursion) since in that case, lowIndex == highIndex. If the first “if” is changed from greater-than-or-equal to just greater-than, I think your code would work.
@Piotr Gabryanczyk: “int m = s+e/2”: I think operator precedence will get you on this one.
failed to read instructions, tested before submitting, the only bug that uncovered was a ” which I’m sure I would have found by analysis.
{source}
int binarysearch(int *data, int sizeOfData, int lookFor)
{
int chopsize = sizeOfData / 2;
int index = chopsize;
// loop until the split size is 0 or value is found.
while( data[index] != lookFor && chopsize )
{
chopsize /= 2;
if (data[index] > lookFor)
index -= chopsize;
else
index += chopsize;
}
return (data[index]==lookFor)?index:-1;
}
{/source}
Failed miserably
I broke rule #6; I guess test driven development has made me soft. My mind hasn’t been trained outside an iterative development model. And since we’ve evolved past punch-cards, I’m okay with that :)
I spent 17 minutes thinking. After that I spent 21 minutes writing test fixtures (which I would run only after all the code was finished).
After that I spent 9 minutes on the binary search function.
In all it took me 47 minutes.
After that I ran all the tests and everything worked on the first run!
Afterwards I read the comments and the links about typical problems; I had already solved all of them while thinking.
Here is my code:
http://pastie.org/928040
Dmitry, you should win some kind of prize: your belt-and-braces solution of both thinking and testing is clearly the way to go. I am encouraged that, after you’d invested the time in design, it worked as intended the first time.
…
Now we wait for Eric to break it :-)
Seems to work.