evan_tech

10:41 am, 10 Feb 07

monotone tech talk

(For some background, try my other posts on these systems, which span four years of history, yikes. In my earliest post there I'm tentatively considering distributed version control as well as functional programming!)

There was a tech talk on monotone at work on Friday. It was kinda disastrous from a giving-talk standpoint, where the speaker was late (not his fault, I'm sure), then tried to project his slides with Linux (people always try this and it never works), then nobody had a laptop to lend him, then someone's Windows laptop decided to somehow die right as they were plugging it in, then someone's Mac laptop projected but the PDF-displayer program wasn't responding to key presses... I felt bad for the guy.

I was already familiar with most of monotone (and you can read their docs if you wanna learn more; it's pretty cool) but the presentation emphasized an aspect of monotone that I hadn't really considered. Since their model is much more about islands (computers) of source exchanging little bits of code, computer errors (network, security, disk) become more serious than in the more traditional model of "keep the central repository backed up and on a secure/stable machine". Suppose Bob's disk has a bit error and that gets sent out to everyone, or I copy code from Eve who has tried to backdoor the project (as in the Linux kernel).

So the main point Nathaniel made that I hadn't really appreciated is the implicit security of identifying files and revisions by hashes. If you assume the hash function is secure (which all of this is predicated upon), then any modifications to a file cause the file's identity to change; any subsequent committed change ("revision") that involves that file is identified by a hash over data including that file's id, so the changedness bleeds into the revision; and any subsequent revisions that use that revision also mix in the hash... this cascades into the fact that every tiny bit change in any file bleeds all the way down into all subsequent revisions. And since everything else uses these hashes (like the netsync protocol), these sorts of errors become visible during normal operations. They mentioned that they've had users mail their list asking why monotone was complaining(?) about a file, ending with the user discovering their disk had introduced bit errors in the file.
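
To make the cascade concrete, here's a toy sketch in Python. This is not monotone's actual revision format: the function names (`file_id`, `revision_id`, `build_history`) and the way a revision gets serialized before hashing are made up for illustration, and SHA-1 is just standing in for "some secure hash". The only idea it's meant to show is that a revision's id covers its files' ids and its parent's id, so a one-bit change anywhere shows up in every later id.

```python
import hashlib


def file_id(content: bytes) -> str:
    """A file is identified by the hash of its contents."""
    return hashlib.sha1(content).hexdigest()


def revision_id(parent_ids, files) -> str:
    """Hash over the parent revision ids plus each (path, file id) pair.

    The serialization here is invented for this sketch, not monotone's.
    """
    h = hashlib.sha1()
    for parent in sorted(parent_ids):
        h.update(b"parent " + parent.encode() + b"\n")
    for path in sorted(files):
        line = "file {} {}\n".format(path, file_id(files[path]))
        h.update(line.encode())
    return h.hexdigest()


def build_history(snapshots):
    """Chain snapshots into a linear history and return the revision ids."""
    ids = []
    parents = []
    for files in snapshots:
        rid = revision_id(parents, files)
        ids.append(rid)
        parents = [rid]  # the next revision mixes in this one's id
    return ids


def flip_first_bit(content: bytes) -> bytes:
    """Simulate a disk error: flip the lowest bit of the first byte."""
    return bytes([content[0] ^ 1]) + content[1:]


# Three revisions of a tiny project (hypothetical contents).
snapshots = [
    {"a.c": b"int x = 1;\n"},
    {"a.c": b"int x = 1;\n", "b.c": b"int y;\n"},
    {"a.c": b"int x = 2;\n", "b.c": b"int y;\n"},
]
good = build_history(snapshots)

# Corrupt one file in the middle revision only.
damaged = [dict(s) for s in snapshots]
damaged[1]["b.c"] = flip_first_bit(damaged[1]["b.c"])
bad = build_history(damaged)

# Revisions before the damage keep their ids; everything after changes.
print([g == b for g, b in zip(good, bad)])
```

The first revision's id matches in both histories, and every id from the damaged revision onward differs, even though the third snapshot's files are byte-for-byte the same as in the good history. That mismatch is the kind of thing a sync protocol comparing ids would trip over.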

Upon reflection, though, this protection is only against what I'd call "physical" attacks, and not against the more "social" attack that I linked to above on the Linux kernel. If someone managed to steal a good committer's keys and stuff some bad revisions in, the only way anyone would notice is if they were reviewing all code they merge in.