04:35 pm, 20 Apr 08
distributed bug tracking
Distributed bug tracking is the natural extension of distributed version control. Aside from the normal benefits of distributed version control, like being able to interact with the bug database while offline, there also seems to be a trend of making the interface work via the command line instead of annoying web-based systems. And, as with Trac, the integration of issue tracking with the source is pretty natural: when you've fixed a bug on a branch you can mark the bug as fixed in that branch, and when that branch lands on your "main" tree that tree's bug state is also merged as a natural consequence of how merges work.
It seems like there isn't dominant software for this yet. Here's my five-minute take on the software I can find:
- Fossil is its own full version control system that integrates a bug tracker as well
- bugs everywhere -- perhaps abandonware, last commit was July 2007
- DITrack -- subversion only, "planning to be backend agnostic" (not sure how svn matches up with distributed, but ok)
- DisTract -- only a web interface using Firefox-specific Javascript to write to disk(?), requires monotone, latest news August 2007
- TicGit -- just learned about it five minutes ago so not sure yet; seems a bit janky to keep bugs in a separate branch
- Ditz -- seems the most appealing to me except that it's all of three weeks old, has emacs integration*, last commit last week
From reading through these I find a surprising variety of models. Here's what seems to me to be the simplest and sanest model: the bugs live in a normal top-level directory in your tree alongside "src" or whatever other directories you have; each issue is in its own file; comments are modifications of the per-issue file.
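To make the file-per-issue model concrete, here is a minimal sketch. The "bugs" directory name, the header fields, and the "---" comment separator are assumptions made for illustration, not the format of any tool listed above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def new_issue(tree: Path, slug: str, title: str) -> Path:
    """Create one file per issue in a top-level "bugs" directory (hypothetical layout)."""
    bugs_dir = tree / "bugs"
    bugs_dir.mkdir(parents=True, exist_ok=True)
    issue = bugs_dir / f"{slug}.txt"
    issue.write_text(f"title: {title}\nstatus: open\n")
    return issue


def add_comment(issue: Path, author: str, text: str,
                when: Optional[datetime] = None) -> None:
    """A comment is just a modification (here, an append) of the per-issue file."""
    stamp = (when or datetime.now(timezone.utc)).isoformat()
    with issue.open("a") as handle:
        handle.write(f"\n--- {stamp} {author}\n{text}\n")
```

Because the issue is an ordinary file in the tree, committing, branching, and merging it needs nothing beyond what the version control system already does for source files.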
But more generally, I'm not even sure if distributed is the appropriate model. The action of recording a new bug modifies the current version of a branch but the bug's existence usually is older than the most recent commit (for example, it often belongs in older branches that branched off before the bug was added). So if a new bug is fixed in an older branch, there's no way to merge that new bug into the older branch without merging the entire state of the newer branch in. Is that sensible? I'm not sure. The alternatives all seem to involve tracking bugs separately from branches and trying to match them up after you commit (like when commit messages mention bug numbers) which always feels like a failure of technology.
The other issue that I'm unsure about is how to integrate a sane web-based frontend for casual users who want to be able to query and report bugs without checking out the code. Some systems have web frontends but it seems to me conflicts could be hard to resolve. Maybe you could make sure a modification to an issue is always an append, and then add some smarts that auto-merge simultaneous appends by some textual timestamp included in each one.
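A rough sketch of that append-only idea, assuming each comment is stored as a (timestamp, author, text) record and that timestamps are ISO-8601 strings in UTC so they sort as plain text. The record shape and the `merge_comments` name are made up for illustration.

```python
from typing import List, Tuple

# (ISO-8601 UTC timestamp, author, text); a hypothetical record shape.
Comment = Tuple[str, str, str]


def merge_comments(ours: List[Comment], theirs: List[Comment]) -> List[Comment]:
    """Merge two append-only comment logs without ever reporting a conflict.

    Nothing is edited or deleted, so the merge is the union of both sides,
    ordered by timestamp (ties fall back to author, then text).
    """
    return sorted(set(ours) | set(theirs))
```

The same call works whether the second log comes from a web frontend or from another developer's branch. Clock skew between machines could still order comments oddly, which this sketch does not try to solve.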
Needs more thought. Sorry for the braindump.
* I'm a vim user, and don't really care about editor wars, so I mention it only to note that emacs integration isn't as useful for me.
Bug tracking is not just a programmer function
Code control is a purely technical function. Distribution/isolation works fine for that, as long as the team communicates reasonably well so they don't duplicate work (or have infinite free resources like open source projects).
However, bug tracking is actually a communication tool more than a programmer tool. Decentralisation of bug tracking is not at all an obvious way to go. It'd be a bit like having a decentralised email system, where you only get emails once I "push" them out to the master server. Funnily enough, that's how email works, but we've gotten around that by basically doing a push and pull every 5 minutes, or even continually.
Imagine how bad email would get if the average person only clicked on Send/Receive once every few days - this would defeat most of the purpose of that communication tool. The same goes for bug tracking, imho.
Daniel
Re: Bug tracking is not just a programmer function
Except, when you’re doing distributed version control, then which view of the code does a bug apply to? It may exist in some branches but not in others; moreover, it may be fixed in some but not yet in others; in some of them it will possibly never be fixed at all, if they’re merged down before the bug is fixed.
So you need to associate any state and state change of a bug with the branches it affects. It would seem that keeping the bug database right inside the source base would be the most natural way to address this.
Re: Bug tracking is not just a programmer function
At least in the way I use distributed version control, when I'm online I sync pretty frequently: at about the frequency I check my email, perhaps. When I'm offline I wouldn't be able to consult an online-only BTS anyway.
There's a third state I have, which is "online, but effectively offline in that I'm hacking on something not yet worth talking about" -- a local branch -- and in that state I'm also not interacting with a BTS.
Sorry for what exactly? I find braindumps from people thinking about the same things as me, especially in areas of active exploration, one of the more useful forms of weblogging.
As for modifying the current version of a branch… it kinda seems to me like git’s history rewriting features are the right way to handle this, but at the same time this somehow feels wrong. Hm.
Oh d’uh. I knew I had to be missing something really obvious.
I have been thinking about the “model” for bugs. The problem is that it isn’t as well-defined a concept as one would wish. For small things, there is an obvious model hinted at by bisect: a bug is a property of the commit that introduces it, as is the fix. If you attach bugs and their fixes to commits they would automatically “infect” all the branches that grow from the commits in question. This is what led me to think of rewriting history.
But that’s not right anyway. One problem is that some bugs cannot be pinned to a single commit. Which commit introduces an architectural misdesign? Another is that even for those bugs that are sufficiently narrowly defined, you may not yet know at the time of reporting which commit introduced the bug. Furthermore, there is all this communication about the bug resolution process that should be recorded somewhere/-how, which is really only indirectly a property of the bug (itself, in turn, a property of the code).
So right now I’m thinking that maybe centralised bug tracking is still the right answer, but that it needs to learn how to refer to commits. Also, there need to be BTS↔DVCS integration tools that can automatically figure out which branches inherit which of the bug’s state changes, so that they can accurately summarise which bugs apply to any one commit, eg. when using a repo browser to look at history.
Does that sound to you like getting closer?
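One way to picture the integration tool described above: if a centralised tracker records the commit that introduced a bug and the commits that fixed it, then whether any given commit is affected follows from ancestry alone. This is only a sketch. The `parents` mapping and both function names are hypothetical, and it covers just the narrow bisect-style bugs, not the ones that cannot be pinned to a single commit.

```python
from typing import Dict, Iterable, List, Set


def ancestors(parents: Dict[str, List[str]], commit: str) -> Set[str]:
    """Return the commit itself plus everything reachable through parent links."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def is_affected(parents: Dict[str, List[str]], commit: str,
                introduced_in: str, fixed_in: Iterable[str] = ()) -> bool:
    """A commit has the bug if it inherits the introducing commit but no fix."""
    history = ancestors(parents, commit)
    return introduced_in in history and not any(fix in history for fix in fixed_in)
```

With a history where "b" introduces the bug, "e" fixes it on a side branch, and "m" merges that branch back, the main-line commit before the merge is still affected while "m" is not.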
One would be the case for making bug tracking distributed, which seems pretty simple. It'd be more like batching bug reports than anything else, as each user committed bugs into their local database and then pushed/pulled to distribute them around the network. You'd have the normal human pain of handling merge cases for duplicate bugs, but that's already needed in most bug models. External users could interface through a dummy node somewhere hooked to the web on one side and available for others to pull updates from on the other. Seems like TicGit (the one I looked at more in-depth) could handle this.
The other side of things you're talking about applies even to non-distributed systems, although it's exacerbated by some of their peculiarities. You'd like to have some sense of bugs as objects existing in the historical record - primarily so that they are automatically inherited by descendant branches? You can't rewrite commit history, of course, and in a distributed system, even if you could, you could never be sure that you distributed it to every branch. In theory you could get away with this in something like SVN/CVS.
So how about a parallel system? The "bug object" satisfying criteria #1 could also attempt to contain a map of all of the historical record it can discover. That map would give you a place to assign bugs that could act as a pointer into the commit structure where the bugs actually live (and where they were resolved). You could create a new "bug map" for all of the code repository your local machine knew about, and then when somebody else pulled it they could merge that into their "bug map" and identify duplicates as normal.
Seems like a good sign for the concept to me. :-)
Bug tracking is a QA tool
On projects where I have used Trac, the bug team did not. It was really just a fancy TODO list for devs.
A tracker should be made for those who use it most - the Testers. PMs want reports, UAT scripts, regression reports, etc. Issue tracking is about 1/2 of it.
Developers generally get it right when they write their own tools and they get it wrong when they write other people's tools.
How I would like to see it working
Recently I've been using Trac alongside Subversion, although I've used a few other defect trackers and version control systems through the years. I'm keen to move to using a DVCS, but at work that is not likely just yet.
However I have a vision in my mind of how DVCS should work. I really like the tight integration that Trac and Subversion have, and I'm looking forward to trying Trac+Mercurial. So much better than ClearCase and ClearQuest.
The work flow that I seem to use currently is:
But in the past I have used a more complex model, in the "branch per task" pattern:
Branch per task adds more work, but it enables the trunk to be kept clean with a large number of developers, which is important in a bigger project. Although if you make some fixes, and they sit around for a day or so before you get the chance to integrate them to trunk, then you can forget about all the defects fixed there, so it would be good to be able to tag the branch with the defect/ticket info.
In both cases it would be good to be able to annotate the branch or changeset metadata with the ticket info. Then when Trac notices the changeset on its timeline, it cross-references its ticket to the branch or changeset.
This way the bulk of ticket information is stored centrally in a web-accessible manner, but the details of the fixes and the relationship to the ticket are tracked in the VCS.
As the changesets get merged through dev, test, and release branches, so does the ticket info. So one can see from a report which tickets have been delivered to test and which have made it to production.
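As a sketch of the kind of report described above, suppose ticket info is carried in changeset messages as "#123"-style references (an assumed convention for this example) and that each branch is represented simply as the list of messages merged into it. The function names here are hypothetical.

```python
import re
from typing import Dict, List, Set

# Assumed convention: a ticket is referenced in the message as "#<number>".
TICKET_RE = re.compile(r"#(\d+)")


def tickets_in(messages: List[str]) -> Set[int]:
    """Collect every ticket number mentioned in a list of changeset messages."""
    found: Set[int] = set()
    for message in messages:
        found.update(int(number) for number in TICKET_RE.findall(message))
    return found


def delivery_report(branches: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Map each branch name to the sorted tickets whose changesets it contains."""
    return {name: sorted(tickets_in(messages)) for name, messages in branches.items()}
```

Since the ticket reference travels with the changeset, merging a changeset from dev into test moves the ticket into the test column of the report without any separate bookkeeping.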
2) I don't think a distributed BTS should fetch new bugs which are resolved already.
>
> [...] I don't think I understand your second point.
Suppose, I've spotted a bug and report it to my private "repository". In a day or two I have it fixed.
Now, the "upstream" [distributed] BTS knows nothing about my bug, and it makes very little sense for this bug to get committed into the upstream repository. That's what I meant...
Any BTS, whether distributed or not, should list outstanding bugs. But the ones discovered somewhere and resolved already... historical garbage to be ignored.
Does this make any sense?
If you were using a centralized BTS, would you report and then close the bug on the centralized system? If so, then syncing your closed bug up to the server is the right thing, or at least the same behavior. Otherwise, you shouldn't have marked the bug locally either.
It's not garbage. It has the same use as all the other bugs that were open for a while and then closed. Do you propose all BTSs should delete closed bugs?
I'd expect my BTS to be able to answer the question: has there ever been a bug in this component with this keyword and what happened to it?
Not garbage at all. A published commit never changes; it’ll always have the previously noted defect. Anyone looking at a historical commit for whatever reason should therefore be able to find out what defects affect it.
Thanks.
I think the Bugs Everywhere model is the nicest I've seen so far, namely that it supports many dVCS backends (arch, bazaar, git, mercurial) and many user interfaces (command-line, web, GUI).
- Chris.
(And man, major WTF at using Bazaar. :) I'm already kind of geeky about these systems and I hadn't ever been forced to install it to look at anything before...)
My preference would be for the web interface to operate as just another client performing merges with the underlying repository, and perhaps having it merge before a new transaction. Having a good UI for resolving conflicts should make everything okay.
- Chris.
Sure, not allowing two comments to be added at the same time is a primitive mechanism, that is true.
Oh, me too -- so much so that I consider getting Bugs Everywhere to run as a Trac backend as being something close to priority #1. It's an upgrade path from Trac-as-centralized-datastore that doesn't lose the convenient UI.
- Chris.