Re: [PATCH 1/9 v4] bisect: add "git bisect replace" subcommand

Christian Couder <chriscool@xxxxxxxxxxxxx> · Wed, 12 Nov 2008 15:15:47 +0100

Le mercredi 12 novembre 2008, Junio C Hamano a écrit :
> Christian Couder <chriscool@xxxxxxxxxxxxx> writes:
> > This subcommand should be used when you have a branch or a part of a
> > branch that isn't easily bisectable because of a bug that has been
> > fixed latter.
>
> While I acknowledge your effort to make bisect easier to use, I do
> not think this is going in the right direction, from the point of view of
> the workflow.
>
> I do agree that the issue it tries to solve is a problem in real life.
> When you want to hunt for a bug, it is certainly possible that your tests
> fail for a bug that is unrelated to what you are hunting for for a range
> of commits.  Borrowing from your picture:
>
>     ...--O--A--X1--X2--...--Xn--B--...
>
> non of the commit marked as Xi may not be testable.
>
> But at that point, will you really spend time to rebuild history between
> A and B by fixing an unrelated bug that hinders your bisect, so that you
> can have a parallel history that is bisectable?  I doubt anybody would.

I think kernel developers and perhaps others do that somehow. I mean, there 
is the following text in the git-bisect(1) documentation:

"
You may often find that during bisect you want to have near-constant tweaks 
(e.g., s/#define DEBUG 0/#define DEBUG 1/ in a header file, or "revision 
that does not have this commit needs this patch applied to work around 
other problem this bisection is not interested in") applied to the revision 
being tested.

To cope with such a situation, after the inner git-bisect finds the next 
revision to test, with the "run" script, you can apply that tweak before 
compiling, run the real test, and after the test decides if the revision 
(possibly with the needed tweaks) passed the test, rewind the tree to the 
pristine state. Finally the "run" script can exit with the status of the 
real test to let the "git bisect run" command loop to determine the 
outcome.
"

So we suggest that people patch at bisect time in case of problems. But I 
think creating a parallel branch should be better in the long run, because 
you can easily keep the work you did to make things easier to bisect and 
you can easily share it with other people working with you. Also using "git 
rebase -i" to create the fixed branch might be easier and more reliable 
than trying to patch at each step.

Of course it also depends on how often people use "git bisect", but it seems 
that there are people out there bisecting very frequently and that these 
people care very much about bisectability of the tree.

> Even if we assume that somebody wants to adopt the workflow to first fix
> an unrelated bug (that may be totally uninteresting for the purpose of
> solving the original issue he set out to figure out) to rewrite the
> history, what he first needs to do is to find out what part of the
> history to rewrite.

Not necessarily. People may decide to adopt a workflow that consists in 
creating a fixed branch as soon as they know about a fix that may prevent 
their code from being tested. This way they know that they will be able to 
bisect at each commit of their full history.

> IOW, he needs to know A and B (and in general, the 
> history is not even linear).  Maybe he guesses what A and B is.  But for
> one thing, after making the guess, he would certainly test A and B to see
> if the original issue exists at these commits.  The sequence of commits
> Xi become irrelevant if A turns out to be bad or B turns out to be good.

Yes that's possible. But in some cases people might find it simpler (for 
example because it might take a long time or a lot of manpower to fully 
test one commit) to just create a fixed branch as soon as they find 
something that prevents testing which might happen during a bisection or 
not.

> And if A is good and B is bad, then the _original bug_ is in the very
> sequence of Xi you are going to rewrite.  By the time you made a
> rewritten history with sequence of commits Yi to be grafted like this:
>
>
>          C--Y1--Y2--...--Yn
>        /
>  ...--O--A--X1--X2--...--Xn--B--...
>
> to make it bisectable, it is very likely that you would have already seen
> the original bug.

I am not sure I understand/agree with that, because I think it may be quite 
easy to make such a fixed branch, especially if the commit message of B is 
a good one like "fix build problem on platform 'foo' introduced by A".

> In such a case where you need to figure out what an unrelated bug is, and
> which commit A and B are involved while bisecting, I think you are much
> better off using bisect skip, as Johannes mentioned earlier.

Even if there was a bisect skip command that could skip a range of commit, 
you might find in the end that the original bug you are looking for was 
introduced in a range you skipped.

> On the other hand, if you already have a well-known bug that was
> introduced at A whose fix at B is also very well-known, you would not
> even need a separate "bisect replace" command nor replace_parents()
> machinery only for the purpose of bisection, would you?  In such a case I
> think you can just use a usual graft.

I don't know very much usual grafts but I think that the problem is that 
usual grafts rewrite history for (nearly) all the commands.

It seems to me that with usual grafts for this purpose, people may for 
example have problems refering to some commits, because it might happen 
that the sha1 of a commit might be in a branch that has been grafted out 
the current branch for easier bisecting.

For example in my patch series, the 2 last patches add a --no-replace option 
to "git bisect", so that it's easy not to use "bisect-replace" branches if 
one does not want too. And I fear that if graft use is generalized, it may 
be necessary to add such options to many other commands (instead of just 
one). And even then, people may want to use some of the grafts but not 
others (for example grafts to see some very old commits imported after the 
initial repo was created but not grafts to fix history) for some purposes 
and a different set (for example all the grafts) for other purposes (like 
bisecting).

> I have a separate idea make 'grafts' easier on object transfer, that is
> somewhat related to this one, by the way.  Instead of making the grafts
> completely a local matter as we do now, we can reserve refs/replace/
> namespace, and record a new commit object to replace an existing commit
> whose object name is $sha1 as refs/replace/$sha1.  We make almost all the
> commands except object enumeration (fsck, receive-pack, send-pack, prune,
> etc.  Roughly speaking, anything that involves "rev-list --objects")
> honor this commit replacement, so that any time you ask for commit $sha1,
> the object layer gives you the replacement commit object back.  In this
> way, you can clone or fetch from such a repository (along with refs in
> refs/replace/ hierarchy) and fsck/prune won't lose the original parents
> (because it does not see replacements).  Things like paranoid update hook
> needs to become very careful about refs/replace/ for security reasons,
> but I think this would make the grafts much easier to use.

I agree that it would make grafts much easier to use (and would be very 
security sensitive).

But except for grafts used to connect to old histories, do you know many 
special commands other than "git bisect" that may use the refs/replace/ 
namespace?

I mean, it may be a good idea to use a special namespace with sha1 named 
refs like "refs/xxx/$sha1" instead of branches named 
like "bisect-replace-SHA1", but I think that it would be better if grafts 
used to fix history for bisecting purpose would be separated from other 
kind of grafts, for example perhaps in "refs/replace/bisect/$sha1".

Regards,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html