On Mon, Mar 16, 2009 at 10:37 AM, John Tapsell <johnflux@xxxxxxxxx> wrote:
> I think it's reasonable to expect false-positives as well as
> false-negatives. e.g. you're looking for a commit that slows down the
> frame rate. But on one of the good commits the hard disk hits a bad
> sector and takes a bit longer to retrieve data and so you get a
> false-positive.

It's true that you could get false positives, as you say. What's less obvious to me is whether it would be a good idea for the algorithm to try to deal with them, or whether it should just report the earliest revision that failed and leave it to the user's judgement to decide whether that is a false positive, and what to do about it.

In the absence of some user-provided way of discriminating, the only way I can see for the algorithm to distinguish between revisions affected by the real bug and revisions affected by false positives is to assume that the false positives occur at some lower rate. There are two difficulties with this. First, the algorithm would presumably have to sample some locations more times in order to estimate the rate there, which sounds expensive - currently the algorithm can usually get away with looking at most locations no more than once. Secondly, I'm not sure how to justify that assumption, or how to model it. In short, false positives look like a can of worms to me; I'm hoping the algorithm is useful without considering them.

The algorithm actually has one potentially problematic assumption already - or rather, it has two alternative assumptions, neither of which is completely believable. It can either assume that the bug causes faults at the same rate in all affected revisions, or that each affected revision has its own completely independent rate. Originally I thought that the latter would be the more conservative assumption - it certainly assumes less. However, the following argument convinces me that the former is actually the more conservative one:

Suppose that in the latest revision we observe a fault in one run out of ten. Under the independent-rates assumption, this observed rate has no effect on our belief about the fault rate in any other affected revision. This means that with a uniform prior on the fault rate, we more or less start out assuming a fault rate of 50% on any other affected revision - much higher than one in ten, implausibly so. If any of those revisions is only affected at a rate of one in ten, the algorithm is quite likely to terminate without ever seeing a fault there, and so to conclude that the bug was introduced later than it really was. On the other hand, we know quite well that the fault rate isn't necessarily going to be identical across revisions either.

Of the two, I think the assumption of identical rates is the more practical one, and the more likely to identify the correct location. It does leave me wondering whether some intermediate assumption would more accurately represent our experience of fault rates, but I haven't thought of a really convincing one.

Ealdwulf.
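
A minimal sketch of the prior/posterior arithmetic behind the argument above, assuming a uniform Beta(1, 1) prior on the fault rate and the hypothetical one-fault-in-ten observation from the example; the numbers are purely illustrative and are not output from the actual bisection code:

    # Sketch only: compares the two assumptions under a uniform Beta(1, 1) prior.
    alpha, beta = 1.0, 1.0        # uniform prior on the fault rate
    faults, runs = 1, 10          # hypothetical observation at the latest revision

    # Independent-rates assumption: the observation at the tip says nothing about
    # other affected revisions, so their expected rate stays at the prior mean.
    prior_mean = alpha / (alpha + beta)                        # 0.5
    # Identical-rate assumption: the same observation updates the shared rate.
    posterior_mean = (alpha + faults) / (alpha + beta + runs)  # 2/12, about 0.17

    print("independent rates, expected rate elsewhere: %.2f" % prior_mean)
    print("identical rate, expected rate elsewhere:    %.2f" % posterior_mean)

    # If an earlier affected revision really faults one run in ten, the chance of
    # seeing no fault there in k runs is 0.9**k - still about 73% after 3 runs,
    # which is why the algorithm may well terminate without seeing a fault there.
    for k in (1, 3, 5, 10):
        print("%2d runs, probability of missing the fault: %.2f" % (k, 0.9 ** k))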