Re: Effectively tracing project contributions with git

On Sun, Sep 13, 2009 at 02:10:49AM +0200, Joseph Wakeling wrote:
> 
> I don't see any solution that doesn't see me browsing diffs -- there's
> no metric that will solve the problem -- but if your stats work could
> help me get an output of the form 'here are all the diffs on file X by
> contributor Y in order of size, largest first' then I think it would
> help a LOT.

This will display all of the diffs on file (pathname) XXX by contributor YYY:

	git log -p --author=YYY XXX 

You might also find the diffstats useful:

	git log --stat --author=YYY XXX

Or if you want *only* the diffstats for the file in question, you might try:

	git log --stat --pretty=format: --author=YYY XXX | grep XXX

So the bottom line is git will allow you to extract quite a lot of
information.  You might need to do some perl- or shell- or python-
scripting to analyze or format the information, but the harder
question is determining exactly what question you want to ask.
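
As a rough sketch of that kind of scripting, and assuming "size" means
lines added plus lines deleted, something like the following would rank
YYY's commits to XXX with the largest first.  The function names
rank_numstat and rank_commits are made up for illustration; --numstat
and --format are ordinary git log options.

```shell
# Read "git log --format='commit %h' --numstat" output on stdin and print
# "<lines changed><TAB><abbreviated hash>", largest change first.
rank_numstat() {
    awk -F'\t' '
        # Remember the hash from each "commit <hash>" header line.
        /^commit / { hash = substr($0, 8); next }
        # numstat lines are "added<TAB>deleted<TAB>path"; binary files show "-".
        NF == 3 {
            added   = ($1 == "-") ? 0 : $1
            deleted = ($2 == "-") ? 0 : $2
            printf "%d\t%s\n", added + deleted, hash
        }
    ' | sort -rn
}

# Hypothetical wrapper: $1 is the author pattern (YYY), $2 the pathname (XXX).
rank_commits() {
    git log --author="$1" --format='commit %h' --numstat -- "$2" | rank_numstat
}
```

Running "rank_commits YYY XXX" inside the repository prints one line per
commit; any hash from that list can be passed to git show to read the
diff itself.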

Eliminating whitespace changes isn't hard (add the -b flag to the
git log -p invocation).  If you want to eliminate variable renaming,
that's harder since that requires
actually parsing the patch.  There are programs that will do that
(normally used by University professors to catch students cheating at
Programming 101 courses :-), but you'd need to do some shell (or perl
or python) scripting to splice them into the git invocations to
extract out the information.

Is there a particular reason why this is important to you?  Is it out
of curiosity, or are you trying to build a case that you've contacted
all of the significant contributors for the purposes of changing the
license used on a file?  If it's the latter, what I'd
probably do is just simply collect everyone who has ever changed a
file (git log --format="%aN <%aE>" pathname/to/a/file | sort -u) and
try to get as many people as possible to agree to the license change.
For the ones who have _not_ agreed, or whom you cannot contact, you
can go back and just analyze their changes (git log --author=YYY) to
decide whether or not they are significant, and whether you need to
try extra hard to contact them, or in the worst case, find someone
to rewrite the parts of the file which they had modified in the past.
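
A minimal sketch of the bookkeeping for that, assuming you keep a
hand-maintained file (called agreed.txt here, a name I've made up) with
one "Name <email>" line per contributor who has already agreed.  The
function names are likewise hypothetical.

```shell
# Everyone who has ever changed the given path, one "Name <email>" per line.
all_contributors() {
    git log --format='%aN <%aE>' -- "$1" | sort -u
}

# Read contributors on stdin and drop those listed in the file "$1".
# -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file.
not_yet_agreed() {
    grep -vxFf "$1"
}
```

Then "all_contributors pathname/to/a/file | not_yet_agreed agreed.txt"
leaves you with the people you still need to chase.  Note that someone
who committed under two different addresses will show up twice unless
the repository has a .mailmap that folds them together.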

Or maybe you have some other reason for gathering said information.
Depending on what the high-level thing it is that you are trying to
do, there may be an easier or more elegant way to get the information
you are requesting.

						- Ted