08:02 am, 7 Apr 08

on my continuing love for dvcs

I spent the weekend hacking on a project that involved a lot of trying things out -- maybe this object needed to carry that field, or maybe it belonged over here, and each change required a cascade of changes throughout the code. I managed it all with local branches in git. I could write commit #1 with "this is the structural change I intended, but only the unit test compiles", commit #2 with "this other module was no good in the first place and should work another way", commit #3 with "restructure a third module to use the new interface provided in #1 and #2". Then, when I decided the approach was no good, I could rewind the branch, start a new one, transplant patches (like #2 above) from the dead branch, and keep going. Right now I have eight branches in my local repo, and I've cleaned up some of the unused ones.
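For the curious, the "transplant" step can be sketched roughly like this. It is only an illustration in a throwaway repository: the file names, branch names (attempt1, attempt2) and commit messages are invented, and I'm assuming git cherry-pick as the way to carry a single commit across.

```shell
set -e
# Hypothetical scratch repository; every name below is made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "base" > a.txt
echo "base" > b.txt
echo "base" > c.txt
git add .
git commit -q -m "baseline"
base=$(git rev-parse HEAD)

# First attempt: three commits on an experimental branch.
git checkout -q -b attempt1
echo "new structure" > a.txt
git commit -q -am "#1 structural change"
echo "rewritten" > b.txt
git commit -q -am "#2 redo the other module"
keep=$(git rev-parse HEAD)   # remember commit #2, the one worth saving
echo "ported" > c.txt
git commit -q -am "#3 port third module"

# The attempt is a dead end: start a fresh branch from the baseline
# and transplant only commit #2 onto it.
git checkout -q -b attempt2 "$base"
git cherry-pick "$keep" > /dev/null

git log --oneline
```

The dead branch stays around until you delete it, so nothing is lost if you later want commit #1 or #3 after all.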

I was attracted to distributed version control initially because I really do a significant fraction of my programming offline: on planes, on the shuttle to work (which used to not have wifi). But since then I've come to see it in two new ways, the latter of which took me a while to settle into.

One (I think this is due to Ben) is that it's just locally cached version control: you can develop in the traditional centralized model if you like, where every commit immediately goes out to the server, but your local disk serves as a cache of the data so any read-only commands are faster. (This was one of svn's innovations over CVS: it keeps a pristine copy of the tree in your .svn directory so "svn diff" is fast. The distributed model is the natural evolution of this.)

The other is that these systems are designed to manage your source code. Any time you're making a temporary backup of a file, or copying a directory into another place*, or making a patch file, or even holding off on checkpointing where you're at until you just get this one other thing working, your code management system is failing you. (I find I make liberal use of git rebase --interactive to squash crazy "this doesn't work yet" commits to make the tree make sense once I've decided the code is right.)
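Here is a minimal sketch of that squashing step, in a throwaway repository with invented file names and messages. In real use git rebase --interactive opens the todo list in your editor and you change "pick" to "squash" by hand; to keep the sketch non-interactive I'm assuming a git new enough to honor GIT_SEQUENCE_EDITOR and a sed that supports -i, and letting sed make that edit instead.

```shell
set -e
# Hypothetical scratch repository; names and messages are invented for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "v1" > code.txt
git add code.txt
git commit -q -m "baseline"
base=$(git rev-parse HEAD)

# Three checkpoint commits made while the code was still in flux.
for step in 2 3 4; do
    echo "v$step" > code.txt
    git commit -q -am "checkpoint $step: doesn't work yet"
done

# Rewrite the todo list so every entry after the first becomes "squash".
# GIT_EDITOR=true accepts the combined commit message without prompting.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -q --interactive "$base"

git log --oneline
```

After the rebase the three checkpoints have collapsed into one commit on top of the baseline, with the file contents unchanged from the last checkpoint.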

Last Friday on my project at work I ran into this sort of thing, as I perpetually do. I'm working on a fairly fundamental change to my project and, of course, one of the 120 high-level regression tests fails. In examining that code I find there's a bug in the testing code; the existing design masks the race condition. So now I'd like to fix that bug: what to do? I end up doing a second checkout and rebuilding (~30 minutes), fixing the bug there, and rerunning all the tests to verify that the bugfix didn't break anything. Then I manually create a patch file and apply it to my first tree. Now, after sending that bug fix for review, I learn that I should've done it a different way -- ok, fix tree #2, use patch -R to unpatch tree #1 and create a new patch to bring it back in ... yuck!

If I were more confident in my patch-management skills, I could skip the second tree: I'd save my current work in a patch file, revert, fix the bug, test the fix, then reapply the patch. And at the code-review point I'd pop two patches off the stack to reshuffle them. (This is, in fact, exactly what Quilt manages.)

But this sort of problem has been solved already in a general way. What I should be able to do is: commit my work in progress in a local branch, rewind, write the bugfix, rebase on top of that. When the review comes, I can rewind and rebase again if necessary. And if I used two checkouts (see the footnote) I could just sync the branches between them, just as if I were making temporary commits server-side.
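In git terms, that could look something like the following. Treat it as a sketch under assumptions rather than a recipe: it runs in a throwaway repository, the branch names (wip, bugfix) and file contents are invented, and the review round is modeled as amending the bugfix commit.

```shell
set -e
# Hypothetical scratch repository; all names below are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "test harness v1" > harness.txt
echo "feature v1" > feature.txt
git add .
git commit -q -m "baseline"
base=$(git rev-parse HEAD)

# 1. Commit the work in progress on a local branch.
git checkout -q -b wip
echo "feature v2 (half done)" > feature.txt
git commit -q -am "WIP: fundamental change"

# 2. Rewind to the baseline and write the bugfix on its own branch.
git checkout -q -b bugfix "$base"
echo "test harness v2 (race fixed)" > harness.txt
git commit -q -am "fix race in test harness"

# 3. Rebase the work in progress on top of the bugfix.
git checkout -q wip
git rebase -q bugfix

# 4. Review says to do it differently: amend the bugfix, then move the
#    WIP commit from the old bugfix commit onto the new one.
git checkout -q bugfix
old_fix=$(git rev-parse HEAD)
echo "test harness v3 (race fixed differently)" > harness.txt
git commit -q --amend -am "fix race in test harness"
git rebase -q --onto bugfix "$old_fix" wip

git log --oneline
```

The --onto form in step 4 matters: a plain "git rebase bugfix" would try to replay the old version of the fix on top of the new one and stop on a conflict in harness.txt.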

I still don't love git -- so clunky in so many ways! -- but using it makes me see how obviously useful this stuff is and how silly it will be if we're not all using it in a few years.

* Sometimes this makes sense in circumstances where you want to keep a separate copy of all the build output around. This is arguably the reason you want builddir != srcdir, though in my experience nobody cares about keeping that working.