Re: [RFC] Use cases for 'git statistics'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 8, 2008 at 5:51 PM, Sverre Rabbelier <srabbelier@xxxxxxxxx> wrote:
> Heya,
>
> I've been busy to write up some use cases for 'git statistics' (a new
> command that I will be implementing this summer during Google Summer
> of Code). For more details on my proposal please see
> http://alturin.googlepages.com/gsoc2008 (a pdf of the use cases is
> hosted there as well for those who prefer pdf). I would like to ask
> for comments on the current use cases; is anything missing, or should
> a particular use case be removed/merged? Please let me know.
>
> Thank you for your time.
>
>
>
> == Terminology ==
>
> There are four types of users: Maintainers, Developers, Bug-fixers,
> and regular Users. The first three are all Contributors.
>
> Name: Maintainer (Contributor)
> Description: The Maintainer reviews commits and branches from other
> Contributors and decided which ones to integrate into a 'master'
> branch.
>
> Name: Developer (Contributor)
> Description: The Developer contributes enhancements to the project,
> e.g. they add new content or improve existing content.
>
> Name: Bug-fixer (Contributor)
> Description: The Bug-fixer locates 'bugs' (as something unwanted that
> needs to be corrected) in the content and 'fixes' them.
>
> Name: User
> Description: The User uses the content, be it in their daily work or
> every now and then for a specific purpose.
>
>
>
> == Use cases ==
>
>
> A model where other Contributors review commits is assumed in all use
> cases. When referenced are made to a Contributor addressing another
> Contributor to adjust their behavior as the result of data mined, it
> should be kept in mind that the Contributor should foremost be the one
> to do this. Using this information to, say, spend more time checking
> one's own commits for bugs when working on a specific part of the
> content on one's own accord is is often more effective then doing so
> only after being asked. </disclaimer>? :P
>
>
> Name: Finding a Contributor that is active in a specific bit of content.
> Description:
>        Whenever a Contributor needs to know about other Contributors that
> are active in a specific part of the content they query git for this
> information. This could be used to figure out whom to send a copy of a
> commit (someone who has recently worked on the content a commit
> modifies is likely to be interested in such a commit). This
> information may be easily gathered with, say, git blame. Aggregating
> it's output (in the background if need be to maintain speedy response
> times), it is trivial to determine whether a Contributor has more
> commits/lines of change than a predefined amount. The main difference
> with git blame is that it's output is aggregated over the history of
> the content, for a specific Contributor, whereas git blame only shows
> the latest changes.
>
>
> Name: Finding which commits touches the parts of the content that a
> commit touches.
> Description:
>        There are several reasons that one might want to know which commit
> touches the parts of the content that a commit touches. This may be
> implemented similar to how git blame works only instead of 'stopping'
> after having found the author of a line, the search continues up to a
> certain date in the past.
>
>
> Name: Integrating the found 'bug introducing' commit with the git
> commit message system.
> Description:
>        When a Bug-fixer sends out a commit to fix a bug it might be useful
> for them to find out where exactly the bug was introduced. Using the
> 'which commit touched the content this commit touches' technique
> optional candidates may be retrieved. After picking which of the found
> commits caused the bug, this information may then automatically added
> to the commit's description. This does not only allow the Bug-fixer to
> make clear the origin of their commit, but also make it possible to
> later unambiguously determine a bug/fix pair. Note that this is
> automated, no user input is required to determine which commit caused
> the bug, only the picking of 'cause' commits requires input from the
> user.
>
>
> Name: Finding the Author that introduce a lot of/almost no bugs to the content.
> Description:
>        Contributors might be interested to know which of the Developers
> introduce a lot of bugs, or the contrary, which introduce almost no
> bugs to the content. This information is highly relevant to the
> Maintainer as they may now focus the time they spend on reviewing
> commits on those that stem from Developers that turn out to often
> introduce bugs. On the other hand, Developers that usually do not
> introduce bugs need less reviewing time. While such information is
> usually known to the experienced Maintainer (as they know their main
> contributors well), it can be helpful to new maintainers, or as a
> pointer that the opinion of the Maintainer about a specific Developer
> needs to be adjusted. Bug-fixers on the other hand can use this
> information to address the Developer that introduces most of the bugs
> they fix, perhaps with advice on how to prevent future bugs from being
> introduced.
>
>
> Name: Finding the Contributor that accepted a lot of/almost no bugs
> into the content.
> Description:
>        Similar to the finding Authors that write the bugs, there are other
> Contributors that 'accept' the commit. Either passively, by not
> commenting when the commit is sent out for review, or actively, by
> 'acknowledging' (acked-by), 'signing off' (signed-off-by) or 'testing'
> (tested-by) a commit. When actively doing so, this can later be
> traced, this information can then be used in the same ways as for
> Authors.
>
>
> Name: Finding parts of the content in which a lot of bugs are
> introduced and fixed
> Description:
>        When a Developer decides to change part of the content, it would be
> interesting for them to know that many before them introduced bugs
> when working on that part of the content. Knowing this the Developer
> might ask for all such buggy commits to try and learn from the
> mistakes made by others and prevent making the same mistake. A
> Maintainer might use this information to spend extra time reviewing a
> commit from a 'bug prone' part of the content.
>
>
> Name: Finding parts of the content a particular Contributor introduces
> a lot of/almost no
> bugs to.
> Description:
>        When trying to decide whether to ask a specific Contributor to work
> on part of the content it might be useful to not only know how active
> they work on that part of the content, but also if they introduced a
> lot of bugs to that part, or perhaps fixed many. Similar to the more
> general case, this can be split out between modifying content and
> 'accepting' modifications. This information may be used to decide to
> ask a Contributor to spend more time on a specific part of the content
> before sending in a commit for review.
>
>
> Name: Finding how many bugs were introduced/fixed in a period of time
> Description:
>        As bugs are recognized by their fixes, it is always possible to match
> a bug to it's fix. Both commits have a time stamp and with those the
> time between bug and fix can be calculated. Aggregating this data over
> all known bug(fixes) the amount of unfixed bugs may be found over a
> specified period of time. For example, finding the amount of fixed
> bugs between two releases, or how many bugs were not fixed within one
> release cycle. This number might then be calculated over several time
> frames (say, each release), after which it is possible to track
> 'content quality' throughout releases. If this information is then
> graphed one can find extremes in this figure (for example, a release
> cycle in which a lot of bugs were fixed, or one that introduced many).
> Knowing this the Contributors may then determine the cause of such and
> learn from that.
>
>
> Name: Finding how much work a contributor has done over a period of time
> Description:
>        When working in a team in which everybody is expected to do
> approximately the same amount of work it is interesting to see how
> much work each Contributor actually does. This allows the team to
> discuss any extremes and attempt to handle these as to distribute the
> work more evenly.
>        When work is being done by a large group of people it is interesting
> to know the most active Contributors since these usually are the ones
> with most knowledge on the content. The other way around, it is
> possible to determine if a specific Contributor is 'active enough' for
> a specific task (such as mentoring).
>
>
> Name: Finding whether a Contributor is mostly a Developer or a Bug-fixer
> Description:
>        To all Contributors it is interesting to know if they spend most of
> their time fixing bugs, or contributing enhancements to the content.
> This information could also be queried over a specific time frame, for
> example 'weekends vs. workdays' or 'holidays vs. non-holidays'.

Heya,

I haven't had replies to this e-mail so far, did it get lost in the list noise?

-- 
Cheers,

Sverre Rabbelier
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux