Measuring Community Involvement (was Re: Contributor Summit planning)

On 8/13/2018 5:54 PM, Jeff King wrote:
> So I try not to think too hard on metrics, and just use them to get a
> rough view on who is active.

I've been very interested in measuring community involvement, with the knowledge that any metric is flawed and we should not ever say "this metric is how we measure the quality of a contributor". It can be helpful, though, to track some metrics and their change over time.

Here are a few measurements we can make:

1. Number of (non-merge) commit author tag-lines.

    Using git repo:

    $ git shortlog --no-merges --since 2017 -sne junio/next | head -n 20
   284  Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
   257  Jeff King <peff@xxxxxxxx>
   206  Stefan Beller <stefanbeller@xxxxxxxxx>
   192  brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
   159  Brandon Williams <bmwill@xxxxxxxxxx>
   149  Junio C Hamano <gitster@xxxxxxxxx>
   137  Elijah Newren <newren@xxxxxxxxx>
   116  René Scharfe <l.s.r@xxxxxx>
   112  Johannes Schindelin <Johannes.Schindelin@xxxxxx>
   105  Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
    96  Jonathan Tan <jonathantanmy@xxxxxxxxxx>
    93  SZEDER Gábor <szeder.dev@xxxxxxxxx>
    78  Derrick Stolee <dstolee@xxxxxxxxxxxxx>
    76  Martin Ågren <martin.agren@xxxxxxxxx>
    66  Michael Haggerty <mhagger@xxxxxxxxxxxx>
    61  Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
    46  Christian Couder <chriscool@xxxxxxxxxxxxx>
    36  Phillip Wood <phillip.wood@xxxxxxxxxxxxx>
    35  Jonathan Nieder <jrnieder@xxxxxxxxx>
    33  Thomas Gummerer <t.gummerer@xxxxxxxxx>

2. Number of other commit tag-lines (Reviewed-By, Helped-By, Reported-By, etc.).

    Using git repo:

    $ git log --since=2018-01-01 junio/next | grep by: | grep -v Signed-off-by: | sort | uniq -c | sort -nr | head -n 20

     66     Reviewed-by: Stefan Beller <sbeller@xxxxxxxxxx>
     22     Reviewed-by: Jeff King <peff@xxxxxxxx>
     19     Reviewed-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
     12     Helped-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
     11     Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
      9     Helped-by: Jeff King <peff@xxxxxxxx>
      8     Reviewed-by: Elijah Newren <newren@xxxxxxxxx>
      7     Reported-by: Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxxx>
      7     Acked-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
      7     Acked-by: Brandon Williams <bmwill@xxxxxxxxxx>
      6     Reviewed-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
      6     Helped-by: Johannes Schindelin <Johannes.Schindelin@xxxxxx>
      5     Mentored-by: Christian Couder <christian.couder@xxxxxxxxx>
      5     Acked-by: Johannes Schindelin <Johannes.Schindelin@xxxxxx>
      4     Reviewed-by: Jonathan Nieder <jrnieder@xxxxxxxxx>
      4     Reviewed-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
      4     Helped-by: Stefan Beller <sbeller@xxxxxxxxxx>
      4     Helped-by: René Scharfe <l.s.r@xxxxxx>
      3     Reviewed-by: Martin Ågren <martin.agren@xxxxxxxxx>
      3     Reviewed-by: Lars Schneider <larsxschneider@xxxxxxxxx>

    (There does not appear to be enough density here to make a useful metric.)
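
One way to get a little more density is to count per person rather than per trailer type, and to fold differently-cased addresses together (the list above splits Johannes Schindelin across `johannes.schindelin@` and `Johannes.Schindelin@`). The following is only a sketch, assuming a POSIX shell with grep and awk; `count_trailer_people` is a hypothetical helper name, not a git command.

```shell
# Hypothetical helper: read `git log` output on stdin and count how often
# each person appears in any "*-by:" trailer other than Signed-off-by.
count_trailer_people() {
    # Keep only lines that start (after indentation) with a "Something-by:" key.
    grep -E '^[[:space:]]*[A-Za-z-]+-by:' |
    grep -v -i 'Signed-off-by:' |
    awk '{
        # Drop the indentation and the trailer key, leaving "Name <email>".
        sub(/^[ \t]*[A-Za-z-]+-by:[ \t]*/, "")
        # Lowercase the <email> part so case variants collapse into one entry.
        if (match($0, /<[^>]*>/))
            $0 = substr($0, 1, RSTART - 1) tolower(substr($0, RSTART, RLENGTH))
        print
    }' |
    sort | uniq -c | sort -k1,1nr
}
```

It could be used as `git log --since=2018-01-01 junio/next | count_trailer_people | head -n 20`. Name variants for the same person are not merged by this sketch.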

3. Number of email messages sent.

    Using mailing list repo:

$ git shortlog --since 2017 -sne | head -n 20
  3749  Junio C Hamano <gitster@xxxxxxxxx>
  2213  Stefan Beller <sbeller@xxxxxxxxxx>
  2112  Jeff King <peff@xxxxxxxx>
  1106  Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
  1028  Johannes Schindelin <Johannes.Schindelin@xxxxxx>
   965  Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
   956  Brandon Williams <bmwill@xxxxxxxxxx>
   947  Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
   890  Elijah Newren <newren@xxxxxxxxx>
   753  brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
   677  Duy Nguyen <pclouds@xxxxxxxxx>
   646  Jonathan Nieder <jrnieder@xxxxxxxxx>
   629  Derrick Stolee <stolee@xxxxxxxxx>
   545  Christian Couder <christian.couder@xxxxxxxxx>
   515  Jonathan Tan <jonathantanmy@xxxxxxxxxx>
   425  Johannes Schindelin <johannes.schindelin@xxxxxx>
   425  Martin Ågren <martin.agren@xxxxxxxxx>
   420  Jeff Hostetler <git@xxxxxxxxxxxxxxxxx>
   420  SZEDER Gábor <szeder.dev@xxxxxxxxx>
   363  Phillip Wood <phillip.wood@xxxxxxxxxxxx>
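
The list above counts `Johannes.Schindelin@` and `johannes.schindelin@` separately. `git shortlog` honors a `.mailmap` file when one exists; where that is not practical, the `-sne` output can be summed per lowercased address afterwards. A minimal sketch under that assumption (`sum_by_email` is a hypothetical helper name):

```shell
# Hypothetical helper: read `git shortlog -sne` output on stdin and sum the
# counts of entries whose email addresses differ only in letter case.
sum_by_email() {
    awk '{
        count = $1
        if (!match($0, /<[^>]*>/)) next          # skip lines without <email>
        email = tolower(substr($0, RSTART + 1, RLENGTH - 2))
        total[email] += count
        if (!(email in name)) {
            # Remember the first name seen for this address.
            line = $0
            sub(/^[ \t]*[0-9]+[ \t]+/, "", line)  # strip the leading count
            sub(/[ \t]*<[^>]*>.*$/, "", line)     # strip the address
            name[email] = line
        }
    }
    END {
        for (e in total) printf "%d\t%s <%s>\n", total[e], name[e], e
    }' | sort -k1,1nr
}
```

For example, `git shortlog --since 2017 -sne | sum_by_email | head -n 20`. This only merges case variants of one address; the same person posting under different names or addresses still needs a mailmap-style mapping.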

4. Number of threads started by user.

    (For this and the measurements below, I imported emails into a SQL table with columns [commit, author, date, message-id, in-reply-to, subject] and ran queries)

SELECT TOP 20
       COUNT(*) as NumSent
      ,[Author]
  FROM [git].[dbo].[mailing-list]
  WHERE [In-Reply-To] = ''
        AND CONVERT(DATETIME,[Date]) > CONVERT(DATETIME, '01-01-2018 00:00')
GROUP BY [Author]
ORDER BY NumSent DESC

| NumSent | Author                     |
|---------|----------------------------|
| 76      | Junio C Hamano             |
| 64      | Stefan Beller              |
| 54      | Philip Oakley              |
| 50      | Nguyễn Thái Ngọc Duy       |
| 49      | Robert P. J. Day           |
| 47      | Christian Couder           |
| 36      | Ramsay Jones               |
| 34      | Elijah Newren              |
| 34      | SZEDER Gábor               |
| 33      | Johannes Schindelin        |
| 31      | Jeff King                  |
| 30      | Ævar Arnfjörð Bjarmason    |
| 24      | Jonathan Tan               |
| 22      | Alban Gruin                |
| 22      | brian m. carlson           |
| 18      | Randall S. Becker          |
| 15      | Paul-Sebastian Ungureanu   |
| 15      | Jeff Hostetler             |
| 15      | Brandon Williams           |
| 15      | Luke Diamand               |
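
For readers without a SQL server at hand, the same count can be approximated from a flat export. This sketch assumes a hypothetical TAB-separated file with the columns author, date (as YYYY-MM-DD) and in-reply-to; neither the file layout nor the `threads_started` name comes from the import described above.

```shell
# Hypothetical helper: count messages with an empty In-Reply-To per author.
# stdin: TAB-separated "author<TAB>date<TAB>in-reply-to", date as YYYY-MM-DD
# $1:    earliest date to include, also as YYYY-MM-DD
threads_started() {
    awk -F '\t' -v since="$1" '
        # ISO dates sort correctly as plain strings, so a string compare works.
        $3 == "" && $2 >= since { n[$1]++ }
        END { for (a in n) printf "%d\t%s\n", n[a], a }
    ' | sort -k1,1nr -k2
}
```

A call such as `threads_started 2018-01-01 < mail.tsv | head -n 20` would then list the most frequent thread starters, where `mail.tsv` stands for the assumed export. Unlike the query, this includes messages dated exactly on the cutoff day.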

5. Number of threads where the user participated.

(This is measured by completing the transitive closure of In-Reply-To edges into a new 'BaseMessage' column.)

SELECT TOP 20
       COUNT(BaseMessage) as NumResponded
      ,Author
  FROM [git].[dbo].[mailing-list]
  WHERE [In-Reply-To] <> ''
        AND CONVERT(DATETIME,[Date]) > CONVERT(DATETIME, '01-01-2018 00:00')
GROUP BY Author
ORDER BY NumResponded DESC

| NumResponded | Author                     |
|--------------|----------------------------|
| 2084         | Junio C Hamano             |
| 1596         | Stefan Beller              |
| 1211         | Jeff King                  |
| 1120         | Johannes Schindelin        |
| 1021         | Nguyễn Thái Ngọc Duy       |
| 799          | Eric Sunshine              |
| 797          | Ævar Arnfjörð Bjarmason    |
| 693          | Brandon Williams           |
| 654          | Duy Nguyen                 |
| 600          | Elijah Newren              |
| 593          | brian m. carlson           |
| 591          | Derrick Stolee             |
| 318          | SZEDER Gábor               |
| 299          | Jonathan Tan               |
| 286          | Christian Couder           |
| 263          | Jonathan Nieder            |
| 257          | Phillip Wood               |
| 256          | Derrick Stolee             |
| 238          | Taylor Blau                |
| 216          | Martin Ågren               |
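
The 'BaseMessage' computation mentioned for this measurement amounts to following In-Reply-To links upward until a message has no known parent. A rough sketch of that idea, assuming a hypothetical TAB-separated export of message-id and in-reply-to (the `thread_roots` name and the file layout are illustrative, not the method actually used for the table above):

```shell
# Hypothetical helper: resolve every message to the root of its thread.
# stdin:  TAB-separated "message-id<TAB>in-reply-to" (in-reply-to may be empty)
# stdout: TAB-separated "message-id<TAB>base-message"
thread_roots() {
    awk -F '\t' '
    { parent[$1] = $2; order[NR] = $1 }
    END {
        for (i = 1; i <= NR; i++) {
            id = order[i]
            root = id
            steps = 0
            # Walk up while the parent is a message we know about.
            # The step cap guards against accidental reply cycles.
            while (parent[root] != "" && (parent[root] in parent) && steps < NR) {
                root = parent[root]
                steps++
            }
            print id "\t" root
        }
    }'
}
```

In this sketch a reply whose parent is missing from the export is treated as the root of its own thread, which is a design choice rather than something the post specifies.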

(Note, some names have not been de-duplicated across multiple email addresses, but the email addresses are removed from these tables since I'm using a markdown generator that strips the emails in < >.)

If you have other ideas for fun measurements, then please let me know.

Thanks,

-Stolee




