Re: [PATCH] rebase: be cleverer with rebased upstream branches

Martin von Zweigbergk <martin.von.zweigbergk@xxxxxxxxx> · Tue, 15 Feb 2011 22:03:23 -0500 (EST)

On Tue, 15 Feb 2011, Junio C Hamano wrote:

> Martin von Zweigbergk <martin.von.zweigbergk@xxxxxxxxx> writes:
> 
> > diff --git a/git-rebase.sh b/git-rebase.sh
> > index 5abfeac..1bc0c29 100755
> > --- a/git-rebase.sh
> > +++ b/git-rebase.sh
> > @@ -466,6 +466,19 @@ esac
> >  
> >  require_clean_work_tree "rebase" "Please commit or stash them."
> >  
> > +test -n "$upstream_name" && for reflog in \
> > +	$(git rev-list -g $upstream_name 2>/dev/null)
> 
> Ugly.

Very. Fixed. Thanks.

> 	test -n "$upstream_name" &&
>         for reflog in $(git rev-list ...)
>         do
>         	...
> 	done
> 
> Don't you need to make sure $upstream_name is a branch (or a ref in
> general that can have a reflog), or does it not matter because the
> "rev-list -g" will die without producing anything and you are discarding
> the error message?

Exactly as you suspect. Is it too ugly?

> Now, a handful of random questions, none of them rhetorical, as I don't
> know the answers to any of them.
> 
> Would it help if the code is made just as clever as the patch attempts to
> be, when the user says
> 
> 	git rebase origin/next~4
> 
> IOW, use the reflog of origin/next even in such a case?

Not sure. It seems too rare to worry about. In those cases, one could
still use the good old '--onto' option manually. Also, if we don't
handle the ref~4 case, the "cleverness" can be disabled by using
ref~0.

> > +do
> > +	if test $reflog = $(git merge-base $reflog $orig_head)
> > +	then
> > +		if test $reflog != $(git merge-base $onto $reflog)
> > +		then
> > +			upstream=$reflog
> > +		fi
> > +		break
> > +	fi
> 
> Do we always traverse down to the beginning of the reflog in the worst
> case?

Yes.

> Would bisection help to avoid the cost?

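I don't think a straightforward bisection would work. If the history
looks something like the one below, where 'b' is the branch to rebase
and 'u' is the upstream, only u@{3} is an ancestor of 'b' and the
entries on either side of it are not, so there is no ordering to
bisect on. We have to go through each entry in the reflog to find
u@{3}.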

        .-u@{0}
       /
      .---u@{1}
     /
x---y-----u@{2}
     \
      .---u@{3}---b
       \
        .-u@{4}
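
For reference, the linear scan that this history forces can be sketched
roughly as below. This is only an illustration, not the code from the
patch: the function names first_ancestor_entry and is_ancestor are made
up here, and is_ancestor assumes a Git that has
"merge-base --is-ancestor", whereas the patch compares plain merge-base
output instead. The patch's extra check against $onto is left out.

```shell
# Hypothetical helper (not in the patch): succeeds if commit $1 is an
# ancestor of commit $2. Assumes "git merge-base --is-ancestor" exists.
is_ancestor () {
	git merge-base --is-ancestor "$1" "$2"
}

# Hypothetical helper: print the first of the given reflog entries
# (newest first) that is an ancestor of <branch>; fail if none is.
# Usage sketch:
#   first_ancestor_entry "$orig_head" \
#       $(git rev-list -g "$upstream_name" 2>/dev/null)
first_ancestor_entry () {
	branch=$1
	shift
	for entry in "$@"
	do
		if is_ancestor "$entry" "$branch"
		then
			echo "$entry"
			return 0
		fi
	done
	return 1
}
```

In the worst case every entry is tested once, which is the cost the
bisection question is about.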

I have an idea inspired by bisection, Thomas's exponential stride, and
what someone (you?) mentioned the other day about virtual merge
commits. I haven't tried it out, but let me know what you think. I'll
try to explain it using an example only:

Exponential stride phase:
1. candidates={ u@{0} }
   merge-base b $candidates -> y, _not_ in $candidates
2. candidates={ u@{1} u@{2} }
   merge-base b $candidates -> y, _not_ in $candidates
3. candidates={ u@{3} u@{4} u@{5} u@{6} }
   merge-base b $candidates -> u@{3}, in $candidates
Bisection phase:
1. candidates={ u@{3} u@{4} }
   merge-base b $candidates -> u@{3}, in $candidates
2. candidates={ u@{3} }
   merge-base b $candidates -> u@{3}, in $candidates, done
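
A rough, untested-against-real-history sketch of the stride phase above,
in shell. The names stride_windows and window_hits are invented for
illustration. window_hits relies on "git merge-base A B C" computing
the merge base of A and a hypothetical merge of B and C; because
merge-base prints only one base by default, a window could be missed
when there are several, so treat this as a starting point only. The
candidates are assumed to be full object names, as printed by
"git rev-list -g".

```shell
# Print the stride windows for a reflog of <n> entries, one per line,
# as "first last" (0-based, inclusive): 0 0, 1 2, 3 6, 7 14, ...
stride_windows () {
	n=$1
	start=0
	size=1
	while test "$start" -lt "$n"
	do
		end=$((start + size - 1))
		if test "$end" -ge "$n"
		then
			end=$((n - 1))
		fi
		echo "$start $end"
		start=$((start + size))
		size=$((size * 2))
	done
}

# Succeed if the merge base of <branch> and the virtual merge of the
# candidate commits is itself one of the candidates.
window_hits () {
	branch=$1
	shift
	base=$(git merge-base "$branch" "$@") || return 1
	for candidate in "$@"
	do
		if test "$candidate" = "$base"
		then
			return 0
		fi
	done
	return 1
}
```

For a reflog of seven entries, stride_windows yields the windows used
in the example: u@{0}, then u@{1}..u@{2}, then u@{3}..u@{6}.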

It works for the few cases I have thought of, but it may break in
other cases. I just read about the virtual merge commits, so I'm not
sure I understand correctly how they work either.

Would it even perform better than searching linearly? I tried stepping
through it manually a few times and it seems faster.

Maybe something based on timestamps would be better?

/Martin