On Fri, Apr 30, 2010 at 3:45 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
> On Thu, Apr 29, 2010 at 7:12 PM, Jay Soffian <jaysoffian@xxxxxxxxx> wrote:
>> Let's say you've got a repo with ~ 40K files and 35K commits.
>> Well-packed .git is about 800MB.
>>
>> You want to find out how many lines of code a particular group of
>> individuals has contributed to HEAD.
>>[...]
>> Am I missing a clever solution?
>
> How often do you need to do this?  If it's just once in your life,
> then the brute force solution of just letting 'git blame' grind
> through it for a few hours is probably the cleverest :)

Yeah, that's basically what I ended up doing. I set up a .mailmap mapping
the authors I was interested in to domain.com. Then:

  $ git log --pretty='%H %aE' HEAD | grep domain.com | awk '{print $1}' |
    git log --no-walk --stdin --name-only --pretty=%n |
    grep -v '^$' | sort -u > files1
  $ git ls-files | sort > files2
  $ comm -12 files1 files2 > files
  $ xargs < files -n1 git annotate | grep domain.com

I didn't use --author=domain.com with the first log invocation because I
wasn't sure whether it respected .mailmap and was too lazy to look it up.
I probably could've used --diff-filter in the second log invocation, but,
meh.

So that worked. It took about 12 minutes to run on a recent MacBook Pro.

As an aside, blame's --porcelain switch is rather poorly documented, and
annotate seemed to have the right output for the job.

j.
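The last step above prints the matching lines rather than counting them. A minimal sketch of a counting variant follows. It assumes a Git new enough to have 'git blame --line-porcelain' (not used in the original commands), which repeats the full commit header, including an "author-mail" line, for every blamed line. The function name count_lines_by_mail is made up for illustration, and whether blame applies .mailmap to author-mail is worth checking on your Git version.

```shell
#!/bin/sh
# count_lines_by_mail PATTERN  (hypothetical helper, not part of git)
# Reads 'git blame --line-porcelain' output on stdin and prints how many
# blamed lines have an author-mail header containing PATTERN.
# File content lines in porcelain output start with a TAB, so anchoring
# on "^author-mail " avoids counting source lines that merely contain
# that text.
count_lines_by_mail() {
    awk -v pat="$1" '
        /^author-mail / && index($0, pat) > 0 { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}
```

With the "files" list built as above, it could be used like this:

  $ xargs < files -n1 git blame --line-porcelain | count_lines_by_mail domain.com

Matching on the header line rather than grepping the whole output means a source line that happens to contain "domain.com" is not miscounted.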