Hi all,

I think I'm fundamentally misunderstanding something about the blame
code...

The other day I wanted to see how much our local fork of DOMjudge
diverged from their upstream. You can grab the entire history at

    git://csa.inf.ethz.ch/domjudge-public.git

if you want to try the commands I ran.

As a first statistic I looked at how many lines are blamed to our
local team (Christoph, Florian and me) by running

    git ls-files | while read f; do git blame -M -- "$f"; done |
    perl -pe 's/^\^?[a-f0-9]* (?:[^(]* )?\(([^2]*?) *20.*/$1/' |
    sort | uniq -c | sort -n

This shows that over 8000 lines are attributed to the three of us:

        1 domjudge
        2 rob
      113 Stijn van Drongelen
      126 Jeroen Schot
      149 neus
      866 Peter van de Werken
     1245 Thomas Rast
     1752 Christoph Krautz
     5350 Florian Jug
    10293 Thijs Kinkhorst
    20397 Jaap Eldering

However, sanity checking this against the diffs of the single commits
shows quite a different number:

    git log --no-merges -p upstream/2.2.. | grep '^+' | grep -v -c '^+++'

gives only 4943 '+' lines, and you can easily verify with

    git shortlog -sn upstream/2.2..

that indeed all commits in that range are ours.

So why does the blame think more lines are ours than we even added
*in total*?

Björn Steinbrink suggested on IRC that I use -M5 -C5 -C5 -C5, which
indeed reduces it to

        1 domjudge
        2 rob
      115 Stijn van Drongelen
      116 Jeroen Schot
      149 neus
      390 Florian Jug
      930 Peter van de Werken
     1209 Thomas Rast
     1612 Christoph Krautz
    11750 Thijs Kinkhorst
    24020 Jaap Eldering

Note especially the huge drop in Florian's numbers. What's going on
here?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch