By the way, there's info on our wiki about how to set up git-svn so that you can fetch with a fast
git fetch from the mirror while still using the slow SVN server when it's time to commit.
This has been working fine for quite a while, but I noticed that occasionally (rarely) the mirror was getting the commit data right but the author wrong.
$ echo $(git rev-list --author=chrome-bot origin | wc -l) $(git rev-list origin | wc -l)
Half a percent of commits.
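To turn those two counts into an actual percentage, something like this sketch works. The helper name percent is made up for illustration, and the usage line assumes (as the command above implies) that mis-attributed commits show up with chrome-bot as the author.

```shell
#!/bin/sh
# percent BAD TOTAL: print BAD as a percentage of TOTAL, to two decimals.
# (Hypothetical helper, not part of git or git-svn.)
percent() {
    awk -v bad="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "n/a"; exit }   # avoid dividing by zero
        printf "%.2f\n", 100 * bad / total
    }'
}

# Usage, run inside the mirror checkout:
#   percent "$(git rev-list --count --author=chrome-bot origin)" \
#           "$(git rev-list --count origin)"
```

git rev-list --count saves the trip through wc -l, but the original pipeline gives the same numbers.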
I asked around and the best guess is this surprising gotcha: SVN commits aren't atomic. :(
The author metadata is a separate property of a commit (a revision property, svn:author), so it's possible for my mirror to grab a commit before the author data has synced over.
What's the fix? svnsync puts a lock in the repo before syncing. Right now I only check the lock; to be correct I'd need to grab the svnsync lock myself while I'm doing my copy. Another option is to rewind and try again whenever I see a bad commit get mirrored, but git-svn doesn't really like having history rewound without clobbering its metadata, and I can't let it just rebuild its metadata from the commit history for complicated reasons outside the scope of this post.
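For the curious, checking the lock can be sketched roughly like this. svnsync records its lock as the svn:sync-lock revision property on revision 0 of the repo it is syncing into. The function names and the URL are placeholders, and treating an empty answer as "unlocked" is an assumption of the sketch, not something svn guarantees.

```shell
#!/bin/sh
# sync_lock_holder URL: print the svnsync lock token, or nothing if unlocked.
# Assumption: empty output means "no lock". An unreachable server looks the
# same here, so a real script would need to tell those two cases apart.
sync_lock_holder() {
    svn propget svn:sync-lock --revprop -r 0 "$1" 2>/dev/null || true
}

# wait_for_unlock URL [TRIES]: poll once a second until the lock is free.
# Returns 0 when unlocked, 1 if the lock is still held after TRIES checks.
wait_for_unlock() {
    url=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        [ -z "$(sync_lock_holder "$url")" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage (placeholder URL):
#   wait_for_unlock "http://svn.example.invalid/repo" && git svn fetch
```

This only checks the lock, so svnsync can still start a sync right after the check passes. That is exactly the window that actually holding the lock would close.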
In summary, I now have a git repo with the wrong authors on some commits. Fixing it would require rebuilding history from the earliest instance of the problem, invalidating everyone else's copies. I haven't done it since I'm not convinced it's important enough. Now that I look at the logs, it seems to have gotten much worse recently...
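Finding that earliest instance is at least cheap. Here is a rough sketch, with a made-up helper name, again assuming the bad commits are the ones attributed to chrome-bot and that history is linear, as it is for an SVN mirror.

```shell
#!/bin/sh
# first_bad_commit REF [AUTHOR]: print the oldest commit reachable from REF
# whose author matches AUTHOR (default chrome-bot, which is an assumption).
first_bad_commit() {
    # --reverse lists oldest first; head keeps only that first line.
    # (--max-count would not do: it is applied before --reverse.)
    git rev-list --reverse --author="${2:-chrome-bot}" "$1" | head -n 1
}

# Usage, run inside the mirror checkout:
#   first_bad_commit origin
```

Everything from that commit onward would get a new hash in a rebuild, which is why everyone else's copies would be invalidated.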