Junio C Hamano <gitster@xxxxxxxxx> writes:

> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> Btw, the counting of commits is broken for the merge people. Do this
>> in the kernel tree, just to see an example of the breakage:
>> ...
>> I dunno. But it looks odd, and the above is not the only example of
>> "those counts don't make sense".
>
> "By" numbers were meant to give credits to people who wrote the code, and
> "via" numbers were meant to give credits to people who helped usher code
> by others' to the person who is making the merge.

I took a look at this again today.

The implementation you saw was written before I did any of the thinking
below, and it merely counts the committer of merges plus the committer
of the tip commit you are pulling, or something. It is slightly better
than a random number generator, but not by a huge margin.

Here is an outline of my current thinking to give a good definition for
the "via" number, which is supposed to give credits to lieutenants (and
possibly sub-lieutenants).

Suppose the history behind the tip commit you are pulling looked like
this:

                    E-----E-----E-----E-----E
                                             \
         A/D--A/D                             E
                 \                             \
     A/B---A/B----B-----B-----B-----C-----C-----C
                       /
    A-----A-----A-----A

where a commit denoted by a single letter (e.g. A on the bottom line) is
authored and committed by that person (by definition a merge is authored
and committed by the same person), and a commit denoted as X/Y was
authored by X and committed by Y.

You are responding to a pull request to integrate the tip commit
authored and committed by C into your history.

The contributor B helped by applying patches from contributor A (the
leftmost two patches on the middle line), merging the work authored by A
and vetted by D (the first merge on the middle line), and the work
authored by A (the second merge on the middle line). He even fixed
things up with the rightmost commit in his history before asking C to
pull.
He should get the credit for all this work to help get A's changes into
the history, including the two commits made by D (which owe credit to D
as well).

For the same reason that the two commits in D's history owe credit both
to B and D, the whole thing owes "via" credit to C, as C was the
lieutenant who was ultimately responsible for delivering this result to
you (in other words, he could have decided not to pull from B).

What I am thinking is for each commit X (not necessarily merges), count
non-merge commits that:

 - are reachable from X;
 - are being merged by the final merge;
 - are not authored by the same author as X itself; and
 - have not been counted to give credit to the author of X yet.

For example, the first two commits by B on the middle line will give 2
credits (because B helped A's patches by applying them), the first merge
by B on the middle line will give 2 credits to B (because it contributes
another 2 commits by A via D to the final history), and the second merge
will give another 4 credits (the commits on the bottom line), but
nothing for the commits that were already counted for his first merge.
Total credit to B is 8 in this example.

The merge made by C will *count* all 8 commits by A (even though they
are credited also to B), 1 commit by B (i.e. the fix-up after merging
the 4-commit series from A), and 6 commits by E.

D gets 2 credits for having applied two patches from A. A and E will
get no "via" credits.

While I find the double-counting that appears in the example somewhat
disturbing, it inherently gives larger credit to the sub-lieutenant who
is closer to the tip, so it might after all match intuition.

Now, computing this efficiently may not be trivial, as you would need an
N^2 reachability analysis when pulling in N commits. Among 2000 recent
merges I sampled from the kernel history, 70+ pull in more than 1000
commits (the largest one, d4bbf7e77, pulls in 21k commits).
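
To make the counting rule concrete, here is a rough Python sketch of it applied to the example history above. It is only an illustration, not the fmt-merge-msg implementation. The commit names, the HISTORY table and the helper names are all made up. It assumes E's side branch joins at the tip commit by C, and it reads "the author of X" in the rule as the committer of X, which is what the worked example (B getting credit for the A/B commits) implies.

```python
# Toy model of the example history; all names here are hypothetical.
# name: (author, committer, parents). Parents that lie outside the
# range being pulled are simply omitted.
HISTORY = {
    "ab1": ("A", "B", []),
    "ab2": ("A", "B", ["ab1"]),
    "ad1": ("A", "D", []),
    "ad2": ("A", "D", ["ad1"]),
    "m1": ("B", "B", ["ab2", "ad2"]),   # B's first merge (A via D)
    "a1": ("A", "A", []),
    "a2": ("A", "A", ["a1"]),
    "a3": ("A", "A", ["a2"]),
    "a4": ("A", "A", ["a3"]),
    "m2": ("B", "B", ["m1", "a4"]),     # B's second merge (A's series)
    "b1": ("B", "B", ["m2"]),           # B's fix-up
    "c1": ("C", "C", ["b1"]),
    "c2": ("C", "C", ["c1"]),
    "e1": ("E", "E", []),
    "e2": ("E", "E", ["e1"]),
    "e3": ("E", "E", ["e2"]),
    "e4": ("E", "E", ["e3"]),
    "e5": ("E", "E", ["e4"]),
    "e6": ("E", "E", ["e5"]),
    "c3": ("C", "C", ["c2", "e6"]),     # tip (assumed to merge E's side)
}


def reachable(history, start):
    """Return the set of commits reachable from start, including start."""
    seen = set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen or name not in history:
            continue
        seen.add(name)
        stack.extend(history[name][2])
    return seen


def via_credits(history, tip):
    """Count "via" credits per committer for everything pulled in by tip."""
    merged = reachable(history, tip)
    # One set per committer, so a commit is never counted twice for
    # the same person.
    counted = {history[x][1]: set() for x in merged}
    for x in merged:
        committer = history[x][1]
        # Naive: one full traversal per commit, i.e. the N^2 cost.
        for name in reachable(history, x):
            author, _, parents = history[name]
            if len(parents) > 1:        # skip merges
                continue
            if author == committer:     # no credit for one's own work
                continue
            counted[committer].add(name)
    return {person: len(names) for person, names in counted.items()}


credits = via_credits(HISTORY, "c3")
for person, n in sorted(credits.items()):
    print(person, n)
```

With this toy history the sketch gives B 8, C 15 (8 by A, 1 by B, 6 by E), D 2, and nothing to A and E, which are the numbers in the example. It runs one traversal per commit, so it has exactly the N^2 behaviour mentioned above and would not do for a 21k-commit pull as is.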