On Tue, Aug 14, 2018 at 01:43:38PM -0400, Derrick Stolee wrote:

> On 8/13/2018 5:54 PM, Jeff King wrote:
> > So I try not to think too hard on metrics, and just use them to get a
> > rough view on who is active.
>
> I've been very interested in measuring community involvement, with the
> knowledge that any metric is flawed and we should not ever say "this metric
> is how we measure the quality of a contributor". It can be helpful, though,
> to track some metrics and their change over time.
>
> Here are a few measurements we can make:

Thanks, it was nice to see a more comprehensive list in one spot.

It would be neat to have a tool that presents all of these
automatically, but I think the email ones are pretty tricky (most people
don't have the whole list archive sitting around).

> 2. Number of other commit tag-lines (Reviewed-By, Helped-By, Reported-By,
> etc.).
>
>     Using git repo:
>
>     $ git log --since=2018-01-01 junio/next|grep by:|grep -v
> Signed-off-by:|sort|uniq -c|sort -nr|head -n 20

At one point I sent a patch series that would let shortlog group by
trailers. Nobody seemed all that interested and I didn't end up using it
for its original purpose, so I didn't polish it further. But I'd be
happy to re-submit it if you think it would be useful.

The shell hackery here isn't too bad, but doing it internally is a
little faster, a little more robust (less parsing), and lets you show
more details about the commits themselves (e.g., who reviews whom).

> 3. Number of threads started by user.

You have "started" and "participated in". I guess one more would be
"closed", as in "solved a bug", but that is quite hard to tell without
looking at the content. Taking just the last person in a thread as the
closer means that an OP saying "thanks!" wrecks it. And somebody who
rants long enough that everybody else loses interest gets marked as a
closer. ;)

> If you have other ideas for fun measurements, then please let me know.

I think I mentioned "surviving lines" elsewhere, which I compute like
this (and almost certainly stole from Junio a long time ago):

  # Obviously you can tweak this as you like, but the mass-imported bits
  # in compat and xdiff tend to skew the counts. It's possibly worth
  # counting language lines separately.
  git ls-files '*.c' '*.h' :^compat :^contrib :^xdiff |
  while read -r fn; do
    # eye candy
    echo >&2 "Blaming $fn..."
    # You can use more/fewer -C to dig more or less for code moves.
    # Possibly "-w" would help, though I doubt it shifts things more
    # than a few percent anyway.
    git blame -C --line-porcelain "$fn"
  done |
  perl -lne '/^author (.*)/ and print $1' |
  sort | uniq -c | sort -rn | head

The output right now is:

  35156 Junio C Hamano
  22207 Jeff King
  17466 Nguyễn Thái Ngọc Duy
  12005 Johannes Schindelin
  10259 Michael Haggerty
   9389 Linus Torvalds
   8318 Brandon Williams
   7776 Stefan Beller
   5947 Christian Couder
   4935 René Scharfe

which seems reasonable.

-Peff
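
For the trailer count quoted earlier, here is a rough sketch of the same shell pipeline wrapped in a function. It rests on a few assumptions: `count_trailers` is a made-up name (not a git command), it reads `git log` output on stdin, and it matches trailer keys case-insensitively so that differently capitalized "Signed-off-by" lines are still excluded. This is only the shell-hackery route, not the shortlog patch series mentioned above.

```shell
#!/bin/sh
# count_trailers: hypothetical helper, not part of git.
# Reads commit messages on stdin and tallies "<something>-by:" trailer
# lines, leaving out Signed-off-by.
count_trailers () {
	# Default "git log" output indents the message by four spaces, so
	# allow leading whitespace before the trailer key.
	grep -iE '^[[:space:]]*[a-z-]+-by:' |
	grep -viE '^[[:space:]]*signed-off-by:' |
	# Strip the indentation so identical trailers compare equal.
	sed -e 's/^[[:space:]]*//' |
	sort | uniq -c | sort -rn | head -n 20
}

# Usage, assuming a ref named junio/next as in the quoted command:
#   git log --since=2018-01-01 junio/next | count_trailers
```

Anchoring the match at the start of the line means prose that merely contains "by:" is not counted, which the plain `grep by:` would pick up.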