Re: [PATCH] rebase: be cleverer with rebased upstream branches

Santi Béjar <santi@xxxxxxxxxxx> · Mon, 14 Mar 2011 00:42:12 +0100

On Sun, Mar 13, 2011 at 11:57 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Martin von Zweigbergk <martin.von.zweigbergk@xxxxxxxxx> writes:
>
>> So here are the updated figures for the force-updated history
>> (pu-like):
>>
>>                  u@{10}     u@{100}    u@{1000}
>> manual         0m0.535s    0m1.164s    0m1.415s
>> linear         0m1.245s   0m37.367s   5m10.068s
>> merge-base    0m14.490s   0m15.409s   0m15.508s
>> exp(1,2)       0m1.056s    0m6.175s   0m27.221s
>> exp(10,10)     0m1.950s   0m20.031s   0m18.215s
>> exp(7,7)       0m1.310s    0m6.851s   0m16.757s
>>
>> and for the non-force-updated history (master-like):
>>
>>                  u@{10}     u@{100}    u@{1000}
>> manual         0m0.885s    0m6.126s   0m52.248s
>> linear         0m1.349s   0m39.688s   5m28.753s
>> merge-base     0m1.160s    0m1.699s    0m1.901s
>> exp(1,2)       0m0.769s    0m4.342s    0m7.360s
>> exp(10,10)     0m0.700s    0m2.535s    0m3.110s
>> exp(7,7)       0m0.653s    0m2.332s    0m3.506s
>
> I have a suspicion that merge-base shouldn't be taken as "one of the
> candidates among others".
>
> It is merely a technique to check multiple heads simultaneously and
> cheaply, instead of checking things one by one, and should be applicable
> regardless of these "multiple heads" come from.  It may come from a linear
> walk of reflog, or it may come from a leaky exponential or fibonacci walk.

Here the merge-base strategy mean take all the reflog at once. And in this
case the reflog is 1000 entries long, it is only relevant to the u@{1000}
case. For me it is more like: take the solution and check that indeed it is
the solution. In this sense it is the "minimum" time required to find the
solution.

>
> Instead of "linear" that checks "Is the tip good?  Is the tip@{1} good?
> Is the tip@{2} good? How about tip@{3}?" repeatedly, we can check "is the
> tip through tip@{N} good?" with a single invocation using the merge-base
> technique.

exp(n,m) is similar to this. But, yes, we could have a merge-base(N)
strategy which checks using the merge-base technique the first N reflog
entries, then the next N entries, and so on. But I think it would scale worst
then the exponential strategy.

>
> Similarly if your exp(i,j) checks "Is the tip good? tip@{1}? tip@{2}?
> tip@{4}?  tip@{8}? ..."  iteratively, you can grab these commit object
> names out of reflog and still use the merge-base optimization, effectively
> making "one at a time" check into "N at a time" check (perhaps checking a
> dozen or two at a time).

Indeed it is "Is the tip good? tip@{1},tip@{2}? tip@{4},tip{5},...,tip@{8}?
... and the merge-base techinque is used to answer each question.

>
> If you generate list of reglog entries way beyond what matters to the end
> result, and feed all of them to the machinery, it is not surprising that
> it would spend more time than linear that can stop early upon the first
> success.

But if the answer is the last entry in the reflog (as in the u@{1000} case) it
is a lower limit, see above.

>
> We should optimize for common case by picking the default that performs
> well for history that is never rewound.  If the the default algorithm on
> pathological histories performs badly, it is Ok to have an alternative
> heuristics that is only triggered on such a history, but if we are going
> to do so, we need to make sure that we can cheaply detect if we need to
> use the alternative heuristics in the first place.

Maybe I'm wrong, but from the number I see that the exp strategy this
"optimize for the common case and works reasonably work for pathological
histories".

>
> Is it easy to tell in an earlier phase if the history is "force-updated"
> or not?
>

I'm afraid this is a similar question.

HTH,
Santi
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html