Monday’s little diatribe on git seemed to stir up quite a bit of strong opinion, both agreeing with me and disagreeing. As is often the case, the two camps seem to be split about 50-50, which makes me happy. It means I can be confident that I’m not talking complete arse-gravy, but I have a good chance of actually learning something.
For anyone who wasn’t around on Monday, the substance of my post was “git is bad because I don’t understand it”. Or to paint myself in a slightly less bad light, “git is bad for me because it makes assumptions about how I work that don’t match how I actually work”. Or, to summarise the summary, “git is the work of Sauron Gorthaur, the Abhorred, servant of Morgoth Bauglir, the Dark Lord that was called Melkor, destroyer and despiser, the corrupt Ainu and corrupter of Arda”.
I’ll admit that yesterday’s post was more a howl of anguish than a reasoned argument (although I still like the Harrier analogy). Having now calmed down a little, I thought it might be worth explaining myself a bit more, and addressing some of the comments, both here and at Hacker News.
Explain yourself, Taylor!
First, I was a bit shocked at the number of people (mostly at HN) who seemed to think that my whole problem with git is the need to specify -a when doing a git commit of all changed files. Folks, that was what is known as an example of how its model isn’t a good fit for how a lot of us work. There are many more of these — for example, the fact that if you run git tag and subsequently push your repo, the tag doesn’t get pushed.
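For the tag example, here is a minimal sketch of the behaviour and the usual workaround, run in a throwaway directory. The repository names (seed, shared.git, work), the file name and the tag name release-1 are all made up for illustration. It assumes a plain git push with default settings, which sends the branch but leaves a lightweight tag behind until it is pushed by name.

```shell
#!/bin/sh
# Throwaway demo: every repository, file and tag name here is hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"

# A shared (bare) repository with one commit, and a working clone of it.
git init -q seed
(cd seed && echo v1 > file.txt && git add file.txt && git commit -q -m initial)
git clone -q --bare seed shared.git
git clone -q shared.git work
cd work

# Commit, tag, and push the way the post describes.
echo v2 > file.txt
git add file.txt
git commit -q -m second
git tag release-1
git push -q

# The branch went up; the tag did not.
before=$(git ls-remote --tags origin)

# Push the tag by name (git push origin --tags would send all tags).
git push -q origin release-1
after=$(git ls-remote --tags origin)

echo "tags on remote before: ${before:-none}"
echo "tags on remote after:  $after"
```

Whether pushing tags separately is a sensible default is exactly the kind of thing the post is arguing about; the sketch only shows what happens.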
Here is a more serious problem that I run into all the time (including once this very day):
- I make a one-line change to a single file.
- I commit my change.
- I git push, only to be told “! [rejected] master -> master (non-fast forward)”. (This is git’s cuddly way of telling you to do a pull first.)
- I git pull, only to be told “CONFLICT (content): Merge conflict in filename. Automatic merge failed; fix conflicts and then commit the result.”
So far, so good: someone else edited the same region of the same file as I did (among their other edits), so of course it’s a conflict, and there’s nothing git could do differently here but notify me and ask me to fix it. So I edit the file, fix the trivial conflict, and git commit filename.
Nuh-uh. “fatal: cannot do a partial commit during a merge.”
Well, darn. So, OK, no problem, I already fixed the conflict, so now I’ll just git merge again to get it to make its world consistent, right? Nope: “fatal: You have not concluded your merge. (MERGE_HEAD exists)”. Well, duh! I was telling you to merge, you stupid git. You’re the branches-and-merges-are-easy version-control system around here.
All right, so I will just git pull again, and this time the merge will work OK. Gotta work, yes? No. “You are in the middle of a conflicted merge.” Well I knew that! That’s why I am trying to resolve it. In fact, that’s why I have resolved it! All I am asking you to do is accept my resolution. Please? Is that so much to ask?
But wait — it’s worse than that! Not only can I not commit the file that had the conflict: I can’t commit any other file. My whole repo is stuffed until I satisfy the hungry god.
But wait — it’s worse than that! git status shows that there are many, many modified files even though I know full well that I only edited the one line of the one file. Because all the other changes that my colleague made have been splunged across my tree and suddenly, what the heck, they’re my responsibility!
The solution turns out to be that I have to use git commit -a, i.e. commit all my changes in one go. But, dammit, git, that’s not what I wanted to do! If I like to commit on a file-by-file basis, what business is it of yours to forbid me? And: much, much worse: my commit -a re-commits all the changes my buddy had already made and committed! Seriously, git: what the hell?
Something is rotten.
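For reference, a sketch of the whole scenario with one way of concluding the merge that avoids commit -a: stage the resolved file with git add, then run git commit with no path. Everything here is illustrative: the repositories (seed, shared.git, mine, theirs) and files (notes.txt, other.txt) are invented, and the "fix" is just overwriting the file with the resolved text. The pull uses --no-rebase so that it merges regardless of local configuration.

```shell
#!/bin/sh
# Throwaway demo: all repository and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"

# A shared repository with one commit, plus two clones of it.
git init -q seed
cd seed
printf 'one\ntwo\n' > notes.txt
echo 'hello' > other.txt
git add notes.txt other.txt
git commit -q -m 'initial'
cd ..
git clone -q --bare seed shared.git
git clone -q shared.git mine
git clone -q shared.git theirs

# The colleague edits the same line I am about to edit, plus another file.
cd theirs
printf 'one\nTWO (theirs)\n' > notes.txt
echo 'hello again' > other.txt
git add notes.txt other.txt
git commit -q -m 'their edits'
git push -q
cd ..

# My one-line change, committed; the pull then stops on a conflict.
cd mine
printf 'one\ntwo (mine)\n' > notes.txt
git add notes.txt
git commit -q -m 'my edit'
git pull -q --no-rebase || true

# Resolve the conflict by hand (here: simply write the resolved content).
printf 'one\ntwo (merged)\n' > notes.txt

# Mark the file as resolved, then conclude the merge with a path-less commit.
git add notes.txt
git commit -q -m 'merge their edits'

# The result is a merge commit with two parents; the colleague's commit
# keeps its own place in the history.
set -- $(git log -1 --pretty=%P)
parents=$#
subjects=$(git log --pretty=%s)
other=$(cat other.txt)
echo "parents of the merge commit: $parents"
echo "$subjects"
```

In this sketch the colleague's change to other.txt shows up in git status during the merge because it is part of what is being merged in; the final commit records the merge itself and has both lines of history as parents.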
A handy household hint: how to abandon your changes when dealing with a conflicted merge
Of course, in the merge-conflict scenario above, you may sometimes see that your friend’s changes are correct and leave yours irrelevant, so that you just want to throw your own changes away and use the version you pulled. Should be pretty simple, huh? Well, according to the top-voted answer to this question on Stack Overflow, the correct thing to do is:
Talk about intuitive.
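This is not the Stack Overflow recipe, just a hedged sketch of one alternative for the narrow case of a single conflicted file where the pulled version should win: git checkout --theirs takes the incoming side of the conflict, after which the file is staged and the merge committed. The repositories and the file notes.txt are invented for the demo, and it assumes the pull is a merge (hence --no-rebase), since "ours" and "theirs" mean something different during a rebase.

```shell
#!/bin/sh
# Throwaway demo: all repository and file names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"

# Shared repository and two clones.
git init -q seed
(cd seed && echo 'original line' > notes.txt && git add notes.txt && git commit -q -m initial)
git clone -q --bare seed shared.git
git clone -q shared.git mine
git clone -q shared.git theirs

# The friend's change reaches the shared repository first.
(cd theirs && echo 'their line' > notes.txt && git add notes.txt \
    && git commit -q -m 'their edit' && git push -q)

# My conflicting change, then the pull that stops on the conflict.
cd mine
echo 'my line' > notes.txt
git add notes.txt
git commit -q -m 'my edit'
git pull -q --no-rebase || true

# Take the pulled version of the conflicted file and conclude the merge.
git checkout --theirs notes.txt
git add notes.txt
git commit -q -m 'merge, keeping their version'

result=$(cat notes.txt)
echo "notes.txt now contains: $result"
```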
Here’s another one that I hate.
I needed to get back an older version of a binary file, foobar.doc, so I could compare it with the current version and see what had changed. (git diff is no use in this situation, because it works on text: I needed to get hold of the earlier revision so I could pull it into OpenOffice, which knows how to compare documents.)
The command that does this is git show, which writes the old version on standard output so you can redirect it into the file of your choice. In general, the command is git show revision:pathToFile. revision can be HEAD^ to mean “the one before the current one”. But when I did git show HEAD^:foobar.doc, I got back a more than usually incomprehensible error message.
fatal: ambiguous argument ‘HEAD^:foobar.doc’: unknown revision or path not in the working tree.
Use ‘--’ to separate paths from revisions
It turns out that this is because the file in question isn’t at the top level of the git module: when I said pathToFile earlier, I really meant it. You have to give the whole path relative to the root of the module. (The bit of the error message about using ‘--’ turns out to be a complete red herring.) So I have to use git show HEAD^:dino/epub/foobar.doc, even though I am already in the directory dino/epub.
You can’t tell me that’s right.
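A sketch of both spellings, using a scratch repository laid out like the one described (dino/epub/foobar.doc, with plain text standing in for the binary document). As far as I know, reasonably recent versions of git also accept a path beginning with ./ after the colon and resolve it relative to the current directory; older versions may not, so treat the second form as an assumption about your git.

```shell
#!/bin/sh
# Throwaway demo: the repository is a scratch one; file contents are stand-ins.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo

# Two revisions of a file that lives below the top level.
mkdir -p dino/epub
echo 'old contents' > dino/epub/foobar.doc
git add dino/epub/foobar.doc
git commit -q -m 'first version'
echo 'new contents' > dino/epub/foobar.doc
git add dino/epub/foobar.doc
git commit -q -m 'second version'

cd dino/epub

# Full path from the root of the repository: works from any directory.
full=$(git show 'HEAD^:dino/epub/foobar.doc')

# Path relative to the current directory, marked by the leading ./
relative=$(git show 'HEAD^:./foobar.doc')

# Typical use: redirect the old revision into a file for comparison.
git show 'HEAD^:./foobar.doc' > foobar-previous.doc
saved=$(cat foobar-previous.doc)

echo "full path form:     $full"
echo "relative path form: $relative"
```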
What makes it much, much worse
I just know that someone (probably several someones) is going to reply to this article saying: “you are mistaken; git is correct.” These people, most of them kindly and gently, will talk me through my misconceptions about what a version is, what a commit is, how it affects the index, what a merge means, why it has to be this way and why I am sadly mistaken in thinking it should be otherwise. If we were discussing this in a pub rather than over the Internet, they would probably find a scrap of paper and draw a nice state-transition diagram for me, showing how the various change-sets propagate between the various checkouts, branches, indexes and repositories. Nine times out of ten, this will be done with patience and tact, with a side of burning evangelistic fervour.
Here is my rebuttal:
I. Do. Not. Care.
This is what I meant last time about git not degrading gracefully. It’s great that it handles multiple local and remote branches and merges and all the other stuff, but you can’t Just Not Know about that stuff. You start out believing what you’re told, that you can just use clone, pull, add, commit and push, and ignore the other 139 git commands(*). But you can’t. You have to keep learning more of them, and learning new and baroque ways of invoking them; and, more importantly, learning more of the concepts. Any day now, I expect to learn that before git moves files into the index, it first keeps them in a top-secret pre-index stash-cache area.
(*) I am not exaggerating: /usr/local/git/libexec/git-core on my Mac contains 144 commands. /usr/lib/git-core on my Ubuntu box is less promiscuous: it contains only 138 commands.
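To repeat the count on another machine without guessing where the helper programs live, a small sketch: git --exec-path prints the directory, and the listing is simply counted. The number will vary with the git version and how it was packaged.

```shell
#!/bin/sh
set -e

# Ask git where its helper programs are installed.
core_dir=$(git --exec-path)

# Count the entries in that directory; the arithmetic expansion
# strips any padding that wc adds on some systems.
count=$(( $(ls "$core_dir" | wc -l) ))

echo "$core_dir contains $count entries"
```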
Who is the user around here?
Is it terribly old-fashioned of me to believe that when a user uses a tool, he should be the one who determines how it’s used?
The bottom line for me with git is that I am sick of being pushed around. It swans about as though it owns the place. It makes arbitrary demands. It tells me what to do. It’s as though ext2fs insisted on particular file-naming conventions, or vi mandated a specific indentation regime for your C code.
Unless of course …
Unless git is a hammer and I am trying to use it as a screwdriver. Or perhaps more appositely, it’s a bandsaw and I’m trying to use it as a bread-knife. Or indeed, it’s a Harrier and I’m trying to use it as a bicycle.
Which I suspect is the case, and why I think the move back to CVS/Subversion might be the way to go.
Some responses to comments
But git does all these cool things!
I know it; and All These Cool Things are of course the main reason I sideways-graded to it in the first place. In particular, the ability to commit (and do other things) when offline and not connected to the master repository is a huge win, and if I do grade back to CVS or Subversion I am really going to miss it.
So I’m not saying, or didn’t mean to say, that git doesn’t offer real advantages over CVS and its brethren. I’m just saying that these things come at a cost; a significant cost, that git advocates are in a bad habit of greatly downplaying. (Sometimes git advocates remind me of The Borg, or perhaps Moonies — they seem so earnest and so committed to what they’re doing, and so completely wired into their tribe’s way of thinking that they can seem unable to contemplate the possibility of any other way of thinking.)
But git has cheap branching
I know it; but I don’t want to branch. To read a lot of the tutorials it seems like git people branch all the time just for the fun of it (and therefore merge all the time, too). Sorry, don’t want to play that game. Cheap branching is better than expensive branching, sure, but that’s like saying influenza is better than cancer. I’d rather just not be ill at all, thanks.
Your mileage may vary, of course. The point is that git doesn’t give you the choice. Oh sure, it pretends it does (“just use clone, pull, add, commit and push!”), but the truth is that branches lurk everywhere — in tags, for example — and you simply can’t Just Not Use Them. That’s not a good model for how I want to work.
(Git Advocate: “That’s not a bug, it’s a feature!” Sorry, not interested.)
But git lets you use any unique prefix of a commit ID, so you can use 23bbcf84 instead of 23bbcf847889c1fbfbb368b27e7b4ef3648879b1
And yet, I am unmoved. Call me weird, but somehow I still prefer 1.8.
This should not need pointing out, but typing 40-character nonsense identifiers is only one of their many drawbacks. That I can abbreviate them to a prefix that might be unique enough mitigates the pain, but in no way eliminates it. Anyone can tell that 1.9 is later than 1.8. Who knows whether 745a4a4275c0322e5e699b02f8783b86ec14dc99 is later than bb7619b6568507d696516b1dd663b7b343d782f6?
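The IDs themselves carry no ordering, but git can be asked. A minimal sketch in a scratch repository: git rev-parse --short produces an abbreviation, and git merge-base --is-ancestor reports through its exit status whether one commit is an ancestor of another. The --is-ancestor option needs a reasonably recent git (1.8.0 or later, as far as I know), and "ancestor of" is only a partial stand-in for "later than": two commits on diverged histories are neither.

```shell
#!/bin/sh
# Throwaway demo in a scratch repository; names are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo

# Two commits, recording the full ID of each.
echo one > file.txt
git add file.txt
git commit -q -m first
older=$(git rev-parse HEAD)

echo two > file.txt
git add file.txt
git commit -q -m second
newer=$(git rev-parse HEAD)

# An abbreviated ID that git considers unambiguous in this repository.
short=$(git rev-parse --short "$newer")
expanded=$(git rev-parse "$short")

# Exit status 0 means the first commit is an ancestor of the second.
if git merge-base --is-ancestor "$older" "$newer"; then
    order=older-first
else
    order=unrelated-or-reversed
fi

echo "short ID: $short"
echo "ordering: $order"
```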
There aren’t many sushi pictures in this article
Sorry, my bad. Here you go:
Try Mercurial instead, it’s similar but has a great tutorial!
I might just do that.
Try Fossil/Bazaar/Darcs/Arch instead!
Sorry, not gonna happen. The problem is that to try out a version control system, you have to trust a bunch of your code into it, and use the system to share that code across multiple computers. In the DVCS world, git seems to be the most popular by a long way; Mercurial has a biggish following so might be workable, perhaps as a staging post on the way to learning to love git, but I just don’t have the time or energy to spend in learning half a dozen different systems.
But wait, Taylor! “git seems to be the most popular by a long way”: is that any criterion to use in choosing something? vi is more popular than Emacs; Windows is more popular than Linux; Britney Spears is more popular than Dar Williams; Big Brother is more popular than Veronica Mars. Yet you will never find me hacking with vi on a Windows box while listening to Britney with Big Brother on in the background. So why would I select a version-control system on that basis?
Simply because nine tenths of the point of version control is so that my colleagues and I can hack on the same code-base together. And that requires that we all use the same repository, which has to be under a single, jointly agreed, VCS. If I decide that Darcs is the answer for me, then I have to persuade all my friends to try the same experiment at the same time. Not gonna happen.
In fact, I think I’ve probably just persuaded myself not to use Mercurial, either. I might still read hginit.com, though.
Try reading Insert name of git tutorial here!
Thanks for the link. I may well do so.
You should read the Pro Git book
I probably will, thanks for all the endorsements.
Git is not version control; it’s change control
I will ponder this. It seems profound.
(Although I notice that not all commenters seem agreed on whether it’s true.)
the biggest step is using git the way it’s intended: lots of branching, partial commits and rebase / merge to keep your downstream changes clean, etc.
Yeah, I know. That’s exactly what I’m trying to avoid.
The conclusion of the matter
One of the more thought-provoking comments on the last article was this one from teh:
I disagree with “Git’s just version control. I resent the idea of investing a month of evenings and weekends just to be able to check my freakin’ files in.”.
Version control is not “just” version control, it’s a first class tool for every programmer, up there with recursion and all that jazz. A programmers work is transforming code from one state to another. Git treats these transformations as first class objects, allowing you to rewrite or reorder them, have alternative transformation branches, send them around etc.
I still, frankly, resent the idea of spending the amount of time that I know will be necessary to become a git wizard. But I am increasingly reconciled to the idea that it will be time invested rather than time wasted.
I don’t intend to be graceful about this (I plan to mutter and groan and whine incessantly), but I have a horrible feeling that the outcome of this article and its predecessor is that I’m going to end up Deeply Learning git. I don’t want to (I hate the idea of ending up as one of the Git Advocates that I was complaining about earlier), but I think I’m going to have to. And if I do it, I’m going to do it properly, which means *sigh* another book, and probably another Long Overdue Serious Attempt At series.
As Xiong Chiamiov wrote in a comment:
I use it because the benefits outweigh all of the things that you mentioned.
Dammit all, he’s right, isn’t he?
Update 1 (a couple of hours later)
Very important point in a comment by Chris, which I should have made in the original article:
And at the risk of sounding absolutely trite, at the end of the day isn’t the important thing that we actually use some form of version/change/source control at all, implementation be damned? After all, isn’t that what separates us from the monkeys?
So, so, true. From 1990 to 2000, I used SCCS, a version-control system written in 1972 that was so amazingly primitive that it still thought digital watches were a pretty neat idea. It makes CVS look like the height of power and sophistication. I am here to tell you that the difference between not using version control and using SCCS was like the difference between night and day; after that, everything else is trivial in comparison. Well, maybe not trivial, but the productivity gap from No Version Control to SCCS is much greater than the gap from SCCS to git.
Thanks for reminding me of that basic fact, Chris.
Update 2 (the next morning)
Many thanks to all of you who have commented (well, nearly all of you). I think I’ve seen more genuinely helpful insights in the comments here and on the previous article than in any of the various git tutorials I’ve read. In particular, I have learned the important concrete lesson that git commit -a is not my friend: an important lesson that should be taught to every git newcomer.
I don’t like to single out individual comments when so much useful stuff has been said (though I’ve also been called an idiot rather more times than I usually like), but for those of you who don’t usually read comments, please just take a look at this one from Kragen Javier Sitaker: apart from anything else, it contains by far the best justification I’ve ever seen for git rebase (or git lie, as I prefer to call it).