Re: [RFC] Use cases for 'git statistics'

On Mon, 12 May 2008, Sverre Rabbelier wrote:
> [Sorry, I hit 'send' instead of 'save']

And now you apparently forgot to add the git mailing list to the recipients...

> On Mon, May 12, 2008 at 2:40 PM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
>>  This is, IMHO, the most complex example (at least to do properly).
>>  It begins with: does given author have code touching given subsystem
>>  (i.e. is it for him/her new contribution wrt. subsystem)? How many
>>  commits he/she has affecting given subsystem? How often he/she rewrites
>>  code? How many bugs were introduced?
> 
> Ah, there is a lot more to this example than I thought. Perhaps this
> data could all be shown and then, using some "importance" metric per
> item a "grade" can be calculated?

Weighting different statistics, Bayesian hypothesis testing/filtering,
expert systems, machine learning... I guess it would take quite a lot of
work to do it well.  It would probably require calculating and adjusting
scores for code (difficulty) and authors (skill), and matching them...

This is certainly in the "wishlist" scope.
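
Just to make the simplest variant (plain weighting) concrete, here is a
rough sketch in Python.  The statistic names and the weights are made up
for illustration, and it assumes every statistic has already been
normalized to the 0..1 range; nothing like this exists yet.

```python
def grade(stats, weights):
    """Combine normalized statistics (0..1) into one weighted score."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Statistics that were not measured count as 0.
    return sum(w * stats.get(name, 0.0) for name, w in weights.items()) / total


# Hypothetical "importance" weights and per-patch statistics.
weights = {"new_to_subsystem": 3, "rewrite_rate": 1}
stats = {"new_to_subsystem": 1.0, "rewrite_rate": 0.0}
score = grade(stats, weights)
```

A higher score would then mean "look at this patch more closely"; the
hard part is of course choosing the weights, not summing them.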

>>  Details I think need to be provided by maintainer...
> 
> Do you mean Junio, or the user of the program?

I mean that all I can provide is speculation.  I am not, and never was,
a maintainer of an OSS project, and I don't know what criteria one uses
(perhaps unvoiced criteria) to decide whether a given patch needs to be
examined more closely, or whether cursory browsing is enough.

>>>>  * Contributor: what happened with my code?
>>>
>>> Do you mean a "track my code" like feature? Showing the movement of a
>>> particular piece of code through the code? (Displaying information
>>> like "moved from foo.c to bar.c in commit 0123456789abcd"?)
>>
>>  I was thinking there about "git blame --reverse".
> 
> Do you mean, filter it's output for a specific user?

I mean, given the code at a given version, what happened to this code?
Filtering "git blame --reverse" by user might be one way of solving it.
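
One possible shape of such filtering, as a sketch only.  It assumes two
blame runs over the same file: an ordinary
"git blame --line-porcelain START -- FILE" to learn who wrote each line,
and "git blame --reverse --line-porcelain START..END -- FILE" to learn the
last commit in which each line still existed.  Both list the lines of FILE
at START in order, so they can be paired by position.  The function names
are mine.

```python
import re

# Entry header: "<sha> <orig-line> <final-line> [<count>]"
HEADER = re.compile(r"^([0-9a-f]{40,64}) \d+ \d+(?: \d+)?$")


def parse_line_porcelain(text):
    """Return one dict per blamed line, in file order."""
    entries, current = [], None
    for raw in text.splitlines():
        if raw.startswith("\t"):          # the line content ends an entry
            if current is not None:
                current["content"] = raw[1:]
                entries.append(current)
                current = None
            continue
        match = HEADER.match(raw)
        if match and current is None:
            current = {"commit": match.group(1)}
        elif current is not None:         # "author Foo", "filename x", ...
            key, _, value = raw.partition(" ")
            current[key] = value
    return entries


def what_happened(forward, reverse, author):
    """For each line written by `author`: (line number, text, last commit)."""
    return [
        (number, fwd["content"], rev["commit"])
        for number, (fwd, rev) in enumerate(zip(forward, reverse), start=1)
        if fwd.get("author") == author
    ]
```

Lines whose last commit equals END are still there; any other commit is
the point just before the line was changed or removed.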

>>>>  * Searching where to contribute: what are oldest part of code dealing
>>>>   with error messages (find ancient code)?
>>>>
>> Or find the lines with oldest modification stamp with "die" or "warn",
>> or find which messages are oldest, even if wrapper have changed.
> 
> In that case, perhaps a regexp would be more suitable, to allow the
> user to search for any specific line, not just "die" or "warn"?

What I had in mind here, but didn't explain clearly enough, was an
extension to pickaxe search.  You want to find when the current error
message was created, even if the way of handling it (fprintf vs. die)
changed, or if the code was re-indented or moved.

Or find all error messages, in the order they were created, for example
in git's case to find ancient error messages and replace them with
something more user-friendly (or less selective about choosing friends ;-).
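
A crude approximation with the pickaxe we already have, as a sketch: pull
the message strings out of the source regardless of the wrapper, then ask
"git log -S" for the oldest commit that changed the number of occurrences
of each string.  The regular expression and the function names are my
assumptions, and a message whose wording changed along the way will look
newer than it really is.

```python
import re
import subprocess

# String literal passed to die()/error()/warning() or fprintf(stderr, ...).
MESSAGE = re.compile(
    r'\b(?:(?:die|error|warning)\s*\(|fprintf\s*\(\s*stderr\s*,)\s*'
    r'"((?:[^"\\]|\\.)*)"'
)


def error_messages(source):
    """Unique message strings found in C source text, in order."""
    found = []
    for match in MESSAGE.finditer(source):
        text = match.group(1)
        if text.endswith("\\n"):      # fprintf needs "\n", die() does not
            text = text[:-2]
        if text and text not in found:
            found.append(text)
    return found


def first_introduced(message, repo="."):
    """Oldest commit found by 'git log -S<message>', as (sha, author time)."""
    lines = subprocess.run(
        ["git", "-C", repo, "log", "--reverse", "--format=%H %at",
         "-S" + message],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    if not lines:
        return None
    sha, timestamp = lines[0].split()
    return sha, int(timestamp)
```

Sorting the messages by the timestamp from first_introduced() would give
the "ancient first" list; re-indentation and a fprintf-to-die conversion
do not change the occurrence count, so they should not hide the origin.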

>>  P.S. I wonder how hard it would be to plug such an SCM statistics system
>>  into something like project management, see
>>   "Joel On Software: Evidence based scheduling" (of programming tasks)
>>   http://www.joelonsoftware.com/items/2007/10/26.html
> 
> Interesting article, I think integrating statistics
> (http://www.statsvn.org/ for example) can be a very powerful tool for
> project management.

You meant http://git.koha.org/gitstat/, didn't you? ;-P

Seriously, what I had in mind was to integrate author dates and commit
dates into the scheduling of a project management system.
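
For example (a sketch only; treating the gap between the two dates as
"time to integrate" is my assumption, and rebases or clock skew can make
it zero or negative), the raw data could come from
"git log --format='%at %ct'", which prints the author and committer
timestamps of each commit:

```python
from statistics import median


def integration_lags(log_text):
    """Seconds between author date and commit date, one value per commit."""
    lags = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue                  # skip blank or malformed lines
        authored, committed = map(int, parts)
        lags.append(committed - authored)
    return lags


# Two made-up commits: one applied a minute after it was written,
# one committed by its author directly.
sample = "100 160\n200 200\n"
typical_lag = median(integration_lags(sample))
```

A scheduling tool could then use the distribution of such lags per author
or per subsystem, rather than a single number.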

-- 
Jakub Narebski
Poland