Re: [PATCH/RFC] fmt-merge-msg: add a blank line after people info

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 15 May 2012 13:24:03 -0700

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> Btw, the counting of commits is broken for the merge people. Do this
>> in the kernel tree, just to see an example of the breakage:
>> ...
>> I dunno. But it looks odd, and the above is not the only example of
>> "those counts don't make sense".
>
> "By" numbers were meant to give credits to people who wrote the code, and
> "via" numbers were meant to give credits to people who helped usher code
> by others' to the person who is making the merge.

I took a look at this again today.

The implementation you saw was written before I did any of the thinking
below, and it merely counts the committer of merges plus the committer of
the tip commit you are pulling, or something.  It is slightly better than
a random number generator, but not by a huge margin.

Here is an outline of my current thinking to give a good definition for
the "via" number, which is supposed to give credits to lieutenants (and
possibly sublieutenants).

Suppose the history behind the tip commit you are pulling looked like
this:

          E-----E-----E-----E-----E
                                   \
            A/D--A/D                E
                   \                 \
       A/B---A/B----B-----B-----B-----C-----C-----C
                         /
      A-----A-----A-----A

where a commit denoted by a single letter (e.g. A on the bottom line) is
authored and committed by that person (by definition a merge is authored
and committed by the same person), and a commit denoted as X/Y was
authored by X and committed by Y.  You are responding to a pull request to
integrate the tip commit authored and committed by C into your history.

The contributor B helped by applying patches from contributor A (the
leftmost two patches on the middle line), merging the work authored by A
and vetted by D (the first merge on the middle line), and the work
authored by A (the second merge on the middle line).  He even fixed things
up with the rightmost commit in his history before asking C to pull.  He
should get credit for all this work to help get A's changes into the
history, including the two commits made by D (which owe credit to D as
well).

For the same reason that the two commits in D's history owe credit to both
B and D, the whole thing owes "via" credit to C, as C was the lieutenant
who was ultimately responsible for delivering this result to you (in other
words, he could have decided not to pull from B).

What I am thinking is, for each commit X (not necessarily a merge), to count
non-merge commits that:

 - are reachable from X;
 - are being merged by the final merge;
 - are not authored by the same author as X itself; and
 - have not yet been counted to give credit to the author of X.

For example, the first two commits by B on the middle line will give him 2
credits (because B helped A's patches by applying them), and the first
merge by B on the middle line will give him 2 more (because it contributes
another 2 commits by A via D to the final history).  The second merge will
give another 4 credits (the commits on the bottom line), but nothing for
the commits that were already counted for his first merge.  Total credit to
B is 8 in this example.

The merge made by C will *count* all 8 commits by A (even though they are
also credited to B), 1 commit by B (i.e. the fix-up after merging the
4-commit series from A), and 6 commits by E.  D gets 2 credits for having
applied two patches from A.  A and E will get no "via" credits.
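
To sanity-check the numbers above, here is a rough sketch in Python (not
the fmt-merge-msg code) that encodes the example history and applies the
rule.  The Commit record, the commit names and all function names are made
up for illustration.  It also assumes two things the text leaves implicit:
the credit for a commit X goes to X's committer (which is what the A/B and
A/D cases in the example require), and every commit in the picture is
being merged by the final merge.

```python
from collections import namedtuple

# Hypothetical in-memory commit record; not a git data structure.
Commit = namedtuple("Commit", "author committer parents")


def build_example():
    """Encode the example history from the diagram in this message."""
    g = {}

    def chain(prefix, n, author, committer):
        # Add n linear commits named prefix1..prefixN; return the last id.
        prev = None
        for i in range(1, n + 1):
            cid = f"{prefix}{i}"
            g[cid] = Commit(author, committer, [prev] if prev else [])
            prev = cid
        return prev

    e_tip = chain("e", 6, "E", "E")      # the six commits by E
    d_tip = chain("d", 2, "A", "D")      # the two A/D commits
    a_tip = chain("a", 4, "A", "A")      # the bottom line
    ab_tip = chain("ab", 2, "A", "B")    # the two A/B commits
    g["m1"] = Commit("B", "B", [ab_tip, d_tip])  # B's first merge
    g["m2"] = Commit("B", "B", ["m1", a_tip])    # B's second merge
    g["fix"] = Commit("B", "B", ["m2"])          # B's fix-up
    g["mc"] = Commit("C", "C", ["fix", e_tip])   # C's merge
    g["c2"] = Commit("C", "C", ["mc"])
    g["c3"] = Commit("C", "C", ["c2"])           # the tip being pulled
    return g


def reachable(g, start):
    """Return the ids of all commits reachable from start, itself included."""
    seen, stack = set(), [start]
    while stack:
        cid = stack.pop()
        if cid not in seen:
            seen.add(cid)
            stack.extend(g[cid].parents)
    return seen


def via_credits(g, merged):
    """Return {person: number of "via" credits} for the commits in merged."""
    counted = {}  # person -> ids of commits already credited to that person
    for cid in merged:
        person = g[cid].committer  # assumption: the committer gets the credit
        bucket = counted.setdefault(person, set())
        for r in reachable(g, cid):
            c = g[r]
            # Non-merge, part of what is being merged, written by someone else.
            if r in merged and len(c.parents) < 2 and c.author != person:
                bucket.add(r)  # a set, so no commit is credited twice
    return {person: len(ids) for person, ids in counted.items()}


g = build_example()
for person, n in sorted(via_credits(g, set(g)).items()):
    print(person, n)
```

Under those assumptions this gives B 8, D 2, A and E 0, and C 15 (that
is, 8 + 1 + 6).  Note that it walks the history once per commit, which is
exactly the naive approach.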

While I find the double-counting that appears in the example somewhat
disturbing, it inherently gives larger credit to the sub-lieutenant who is
closer to the tip, so it might match intuition after all.

Now, computing this efficiently may not be trivial, as you would need an
N^2 reachability analysis when pulling in N commits.  Among 2000 recent merges
I sampled from the kernel history, 70+ pull in more than 1000 commits (the
largest one d4bbf7e77 pulls in 21k commits).
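
One possible way to avoid re-walking the history from every commit,
sketched in Python under the assumption that the commits being merged can
be listed parents-first, is to carry each commit's reachable set forward
from its parents.  The function name and the dict-of-parents input are
hypothetical.  This only saves the repeated traversals; the sets
themselves can still add up to N^2 entries in the worst case, so it is
not a real answer to the cost problem.

```python
def reachable_sets(parents):
    """Map each commit to the set of commits reachable from it (itself included).

    parents maps a commit id to the list of its parent ids.  Every commit
    must be listed after all of its parents; dict insertion order is used.
    """
    reach = {}
    for commit, ps in parents.items():
        s = {commit}
        for p in ps:
            s |= reach[p]  # reuse what was already computed for the parent
        reach[commit] = s
    return reach


# Made-up five-commit history: two side branches, a merge, then a tip.
history = {"a1": [], "a2": ["a1"], "b1": [], "m": ["a2", "b1"], "tip": ["m"]}
reach = reachable_sets(history)
print(sorted(reach["tip"]))
```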