On Sat, Oct 23, 2010 at 11:52:26AM +0000, Ævar Arnfjörð Bjarmason wrote:

> Either way it doesn't matter, since I'm not interested in being a SFC
> liaison. I just want to hack, not deal with issues like these (but
> more power to people who want to).

I didn't mean to pick on you, btw. It's just that I was surprised to see
you, whose first commit was only 6 months ago, in the list of top
contributors by lines of code. You're productive, but not _that_
productive. :)

As it turns out, even though Junio's numbers are doubled, you are in
fact high by line count. It's because of compat/regex:

  $ git log --pretty=format: --numstat --author=Bjarmason compat/regex/* |
    perl -ne '/^\d+/ and $total += $&; END { print "$total\n"; }'
  11186

which accounts for 85% of your contribution. :)

> But I think picking people for anything based on the number of lines
> that git-blame thinks people "own" is a bad criteria. My contributions
> to Git are relatively small, but I've happened to pick projects (the
> test suite, gettext) that have touched a lot of lines of code.
>
> But other people who've done 10x more work than I have (both in time &
> importance) probably have 10x less lines of code assigned to them.

I think counting surviving lines via git-blame is not that bad a metric
for importance. It's certainly better than counting added lines (as I
did above), as it measures lines that people are actually still using.
The problem here is that we have quite large chunks of "uninteresting".
Junio made some attempt to account for this by counting various parts of
the codebase separately. Probably compat/ should have been removed from
the core count (ditto for Marius Storm-Olsen, whose line count is
inflated by importing nedmalloc; which isn't to say that any of these
contributions aren't important. They just aren't the same as sitting
down and writing 10,000 lines of custom git code).
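
For anyone who would rather not read perl, the added-lines count above
can be sketched in Python. This is only an illustration working on
captured `git log --numstat` output: the function name `added_lines` and
the sample lines are made up, not real data from git.git. It assumes the
usual numstat layout of "added<TAB>deleted<TAB>path", and skips binary
files, which show "-" instead of a count.

```python
def added_lines(numstat_output: str) -> int:
    """Sum the 'added' column of `git log --numstat` output."""
    total = 0
    for line in numstat_output.splitlines():
        fields = line.split("\t")
        # numstat lines look like "added<TAB>deleted<TAB>path";
        # binary files use "-" for both counts and are skipped here,
        # as are the blank lines left by --pretty=format:.
        if len(fields) >= 3 and fields[0].isdigit():
            total += int(fields[0])
    return total


# Hypothetical sample input, for illustration only.
sample = "\n".join([
    "",
    "120\t4\tcompat/regex/regcomp.c",
    "-\t-\tsome/binary.png",
    "",
    "30\t10\tcompat/regex/regex.h",
])

print(added_lines(sample))
```

Like the one-liner, this counts added lines only, so it says nothing
about how many of those lines survive today.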

In general, any line count of code (surviving or otherwise) will favor
people who are adding features rather than fixing bugs. I prefer commit
count, where I personally fare much better. :)

One interesting metric to me is the ratio of commit log lines to code
lines. A high ratio implies (to some degree) working on bugfixes, where
the actual changed lines of code are less important than the time you
spend figuring out _which_ lines to change. You can measure it with
something like:

  $ git log --format='Author: %an%n%w(0,4,4)%B' --numstat --no-merges |
    perl -ne '
      if (/^Author: (.*)/) { $author = $1 }
      elsif (/^\s{4}.+/) { $commit{$author}++ }
      elsif (/^\d+/) { $code{$author} += $& }
      END {
        print($commit{$_} / $code{$_}, " $_\n")
          for grep { $code{$_} } keys(%code)
      }
    ' | sort -rn

Of course it has its own set of flaws. One giant feature contribution
can outweigh a lot of bugfixes in the average.

-Peff