March 23rd, 2007

  • evan

code review is great

At Google, all code must be reviewed before you can check it in. This means that another engineer needs to look over your change and give it their approval. Traditionally we've done it via email and some wrapper scripts, but more recently people have been switching to Guido's fancy tool.

When I first started doing this I was a bit skeptical. On many other projects I've worked on, you instead use post-commit emails (or equivalents) to watch what everyone is doing. You can contact the submitters if something went wrong, so it seemed to me to be more or less equivalent to review without the overhead.

But in practice, I find that after-the-fact emails let a lot more shady code to get through. It's surprisingly common (both in the code I write and the code by others I review) for something to be written in a slightly messy way -- like with an unclearly named variable, or with an overlong function that could be split into two -- and even though the author knows it's not ideal they'll still submit it. Then, in a post-commit email world, others will see the commit and not be happy with it, but they won't be unhappy enough to warrant requiring a subsequent change for cleanliness.

Instead, in a code-review world, nothing's been committed yet so it seems like it's not that much more trouble to just get it right in the first place. My most common sort of feedback in a review is a request to clarify a bit of code by either restructuring or with comments. Especially when you have a lot of people consuming the same code, this sort of polish is really quite nice and worth the pain of reviewing. I now fully endorse this sort of workflow.

One of the things that Mondrian (linked above) adds to the process is that each attempted commit is a sort of microbranch. Perforce is actually pretty weird as compared to all the other VCSes I've used, but you can sorta map the code review process to more "normal" systems and call sending code for review just sending someone a diff. In the past, when the submitter then updates their code in response to feedback, the reviewer just gets a second diff, and doesn't get any sort of diff-diff to see what changes were actually made between the diffs.

Mondrian adds this diff-diff feature by actually snapshotting the file state at different review points and (within its own UI) letting you see diffs between the files at different points in the review. In fact, what's it's done is implement its own sort of VCS outside of Perforce! It even identifies the different file states with hashes instead of changelist numbers (a Perforce-ism).

Identifying file states with hashes and micro-branches are the bread and butter of monotone, and in writing this post I could immediately sorta imagine how you could implement this sort of code-review process completely within monotone: you'd add an extra cert on revisions that had passed the review process, and then make it so all the normal commands (like "update") only pull revisions that passed review. And then I remembered that they had already done exactly this. (There are a lot of details necessary to making this not be totally painful, and since I haven't looked at their implementation I can't vouch for it.)

monotone is a seriously sweet piece of software, primarily in its powerful design. I wish there was some way to make it less confusing and complicated. I wonder how much of its complexity is just fundamental to the problem space it attacks...