On Sun, Mar 13, 2011 at 11:57 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Martin von Zweigbergk <martin.von.zweigbergk@xxxxxxxxx> writes: > >> So here are the updated figures for the force-updated history >> (pu-like): >> >> u@{10} u@{100} u@{1000} >> manual 0m0.535s 0m1.164s 0m1.415s >> linear 0m1.245s 0m37.367s 5m10.068s >> merge-base 0m14.490s 0m15.409s 0m15.508s >> exp(1,2) 0m1.056s 0m6.175s 0m27.221s >> exp(10,10) 0m1.950s 0m20.031s 0m18.215s >> exp(7,7) 0m1.310s 0m6.851s 0m16.757s >> >> and for the non-force-updated history (master-like): >> >> u@{10} u@{100} u@{1000} >> manual 0m0.885s 0m6.126s 0m52.248s >> linear 0m1.349s 0m39.688s 5m28.753s >> merge-base 0m1.160s 0m1.699s 0m1.901s >> exp(1,2) 0m0.769s 0m4.342s 0m7.360s >> exp(10,10) 0m0.700s 0m2.535s 0m3.110s >> exp(7,7) 0m0.653s 0m2.332s 0m3.506s > > I have a suspicion that merge-base shouldn't be taken as "one of the > candidates among others". > > It is merely a technique to check multiple heads simultaneously and > cheaply, instead of checking things one by one, and should be applicable > regardless of these "multiple heads" come from. It may come from a linear > walk of reflog, or it may come from a leaky exponential or fibonacci walk. Here the merge-base strategy mean take all the reflog at once. And in this case the reflog is 1000 entries long, it is only relevant to the u@{1000} case. For me it is more like: take the solution and check that indeed it is the solution. In this sense it is the "minimum" time required to find the solution. > > Instead of "linear" that checks "Is the tip good? Is the tip@{1} good? > Is the tip@{2} good? How about tip@{3}?" repeatedly, we can check "is the > tip through tip@{N} good?" with a single invocation using the merge-base > technique. exp(n,m) is similar to this. But, yes, we could have a merge-base(N) strategy which checks using the merge-base technique the first N reflog entries, then the next N entries, and so on. But I think it would scale worst then the exponential strategy. > > Similarly if your exp(i,j) checks "Is the tip good? tip@{1}? tip@{2}? > tip@{4}? tip@{8}? ..." iteratively, you can grab these commit object > names out of reflog and still use the merge-base optimization, effectively > making "one at a time" check into "N at a time" check (perhaps checking a > dozen or two at a time). Indeed it is "Is the tip good? tip@{1},tip@{2}? tip@{4},tip{5},...,tip@{8}? ... and the merge-base techinque is used to answer each question. > > If you generate list of reglog entries way beyond what matters to the end > result, and feed all of them to the machinery, it is not surprising that > it would spend more time than linear that can stop early upon the first > success. But if the answer is the last entry in the reflog (as in the u@{1000} case) it is a lower limit, see above. > > We should optimize for common case by picking the default that performs > well for history that is never rewound. If the the default algorithm on > pathological histories performs badly, it is Ok to have an alternative > heuristics that is only triggered on such a history, but if we are going > to do so, we need to make sure that we can cheaply detect if we need to > use the alternative heuristics in the first place. Maybe I'm wrong, but from the number I see that the exp strategy this "optimize for the common case and works reasonably work for pathological histories". > > Is it easy to tell in an earlier phase if the history is "force-updated" > or not? > I'm afraid this is a similar question. HTH, Santi -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html