Re: [RFH] git cherry vs. git rev-list --cherry, or: Why does "..." suck?

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 23 Mar 2011 11:20:54 -0700

Michael J Gruber <git@xxxxxxxxxxxxxxxxxxxx> writes:

> Adding some recent insight:
>
> Michael J Gruber venit, vidit, dixit 22.03.2011 13:07:
>> Performance
>> ===========
>> 
>> I don't get this:
>> 
>> git cherry A B: 0.4s
>> git rev-list --cherry A...B: 1.7s
>> (more details below)
>
> I can get the latter down to 0.95s and this
>
>> merge-base A B: 0.95s
>> merge-base --all A B: 0.95s
>> rev-parse A...B: 0.95s
>
> to 0.16s each. The downside is that merge-base may give a few
> unnecessary candidates (commits which are ancestors of other commits it
> returns), but this does not change the results for rev-list, of course.
>
> I get this dramatic speedup by removing the check for duplicates from
> get_merge_bases_many() in commit.c. After a first merge_bases_many()
> run, returning N commits, that check calls merge_bases_many() again for
> each pair (N choose 2) to check whether one is contained in the other.
> Quite a bottleneck. Removing it works great. But can we live with a few
> additional merge bases?

When we run merge-base as the top-level command (this includes
reduce_heads() that is used by "git merge"), we have to cull unnecessary
phantom bases that can be reached by other bases, so you are not allowed
to make such a change unconditionally.

Passing down a parameter from a caller that is prepared to handle phantom
merge bases correctly is probably the right approach.  Existing callers
can make "safer" calls for now; you can later examine them and turn them
into "faster" calls if they operate correctly given a result that contains
phantom bases.
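
To make the idea concrete, here is a toy sketch in Python.  It is not
git's C code: the names merge_bases(), remove_redundant() and the "cull"
parameter are made up for illustration, and the first pass deliberately
over-reports candidates (every common ancestor) so that the effect of
culling is visible.  The real first pass in commit.c is far more
selective.  The sketch also shows why a symmetric-difference walk such as
A...B is unaffected by phantom bases.

```python
def ancestors(parents, commit):
    """Return the set of commits reachable from `commit`, itself included."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def remove_redundant(parents, candidates):
    """Drop every candidate that is an ancestor of another candidate."""
    return [
        c for c in candidates
        if not any(c != o and c in ancestors(parents, o) for o in candidates)
    ]


def merge_bases(parents, a, b, cull=True):
    """Toy merge-base lookup (hypothetical API, not git's).

    cull=True  -> "safer" call: phantom bases are removed.
    cull=False -> "faster" call: the caller must tolerate phantom bases.
    """
    # Simplified first pass: treat every common ancestor as a candidate.
    candidates = sorted(ancestors(parents, a) & ancestors(parents, b))
    return remove_redundant(parents, candidates) if cull else candidates


def symmetric_difference(parents, a, b, bases):
    """Commits reachable from a or b but not from any of the given bases."""
    excluded = set()
    for base in bases:
        excluded |= ancestors(parents, base)
    return (ancestors(parents, a) | ancestors(parents, b)) - excluded


# Example history: R <- M <- A and R <- M <- B (child -> list of parents).
history = {"A": ["M"], "B": ["M"], "M": ["R"], "R": []}
safe = merge_bases(history, "A", "B")              # culled result
fast = merge_bases(history, "A", "B", cull=False)  # may contain phantoms
```

In this example the "safer" call returns only M, while the "faster" call
also returns R, which is reachable from M.  Feeding either list to
symmetric_difference() gives the same set, because everything reachable
from a phantom base is already excluded through the base that reaches it.
A caller like "git merge", on the other hand, would see an extra base and
has to keep using the culled form.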