Re: Our merge bases sometimes suck

On 06/17/2014 05:08 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes:
> 
>> The "best" merge base
>> =====================
>>
>> But not all merge bases are created equal.  It is possible to define a
>> "best" merge base that has some nice properties.
>>
>> Let's focus on the command
>>
>>     git diff $master...$branch
>>
>> which is equivalent to
>>
>>     git diff $(git merge-base $master $branch)..$branch
>> ...
>> I propose that the best merge base is the merge base "candidate" that
>> minimizes the number of non-merge commits that are in
>>
>>     git rev-list --no-merges $candidate..$branch
>>
>> but are already in master:
>>
>>     git rev-list --no-merges $master
> 
> I welcome this line of thought very much.
> 
> There is one niggle I find somewhat curious but am either too lazy
> or too stupid to think it through myself ;-)
> 
> The "merge-base" is a symmetric operation, because the three-way
> merge, which is the primary customer of its result, fundamentally
> is.  From your description, it sounds like the "best" merge base
> however may not be symmetric at all.  The merge-base between A and B
> that makes "git diff A...B" the easiest to read by minimizing the
> distance between it and B may be different from the merge-base
> between A and B that makes the other diff "git diff B...A" the
> easiest to read.
> 
> Or it may not be asymmetric---that is why I said I didn't think it
> through.  I am not saying that it is bad if the "best" merge-base is
> an asymmetric concept; I am curious if it is asymmetric, and if so
> if that is fundamental.

It just looks asymmetric, but actually it is symmetric, which was kind of
surprising when I realized it.  The argument is in the next section
"Symmetry; generalization to more than two branches".  Michael Gruber
showed the same thing upthread using set notation, which is easier to
follow.  Here is his argument in symbolic notation.  We want to minimize

    N = |(branch - candidate) ∧ master|

where "branch" represents the set of all (non-merge) commits reachable
from "branch" etc., "|x|" represents the number of elements in set "x",
"∧" is set intersection, "∼" is set complement, and candidate is a
merge base of branch and master.  Rewriting the set difference:

    N = |(branch ∧ ∼candidate) ∧ master|
      = |(branch ∧ master) ∧ ∼candidate|

Since candidate is a common ancestor of branch and master,

    candidate ⊆ branch ∧ master

so removing candidate from the intersection removes exactly |candidate|
elements, and we have

    N = |branch ∧ master| - |candidate|

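Since "|branch ∧ master|" is the same for all candidates, minimizing N
is the same as maximizing |candidate|, which is the count reported by

    git rev-list --count --no-merges $candidate

This expression does not depend on which tip is called master and which
is called branch, so the criterion is clearly symmetric in master vs.
branch.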
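
For anyone who wants to check the argument mechanically, here is a
minimal Python sketch on a made-up criss-cross history.  Everything in
it is an assumption for illustration: the commit names, the PARENTS
dict, and the helpers ancestors, non_merges, merge_bases, n and
best_merge_base are hypothetical, and it models history as a plain dict
instead of calling git.  It only shows that N equals
|branch ∧ master| - |candidate| and that swapping the two tips gives the
same N for each candidate.

```python
# Toy commit graph: each commit maps to its list of parents.
# "m1" and "m2" are merges, giving a criss-cross with two merge bases.
PARENTS = {
    "r": [],
    "x1": ["r"],
    "y1": ["r"],
    "y2": ["y1"],
    "m1": ["x1", "y2"],  # merge on the master side
    "m2": ["y2", "x1"],  # merge on the branch side
    "master": ["m1"],
    "branch": ["m2"],
}


def ancestors(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen


def non_merges(commits):
    """Drop commits with more than one parent (like --no-merges)."""
    return {c for c in commits if len(PARENTS[c]) < 2}


def merge_bases(a, b):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    return sorted(
        c for c in common
        if not any(c in ancestors(o) for o in common if o != c)
    )


def n(candidate, tip, other):
    """|(tip - candidate) ∧ other|, counting non-merge commits only."""
    return len(non_merges(
        (ancestors(tip) - ancestors(candidate)) & ancestors(other)
    ))


def best_merge_base(a, b):
    """Merge base with the most non-merge ancestors (maximal |candidate|)."""
    return max(merge_bases(a, b), key=lambda c: len(non_merges(ancestors(c))))


common = non_merges(ancestors("master") & ancestors("branch"))
for cand in merge_bases("master", "branch"):
    size = len(non_merges(ancestors(cand)))
    # N = |branch ∧ master| - |candidate|
    assert n(cand, "branch", "master") == len(common) - size
    # Same value with the roles of the two tips swapped.
    assert n(cand, "branch", "master") == n(cand, "master", "branch")
    print(cand, n(cand, "branch", "master"), size)

print(best_merge_base("master", "branch"))
```

On this toy graph the two merge bases are x1 (N = 2) and y2 (N = 1), so
y2 is picked whichever tip is passed first.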

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/