Re: [DISCUSSION] Growing the Git community

Hi All,

Some rhetorical, top-level, systemy thinking...

On 19/09/2019 17:30, Derrick Stolee wrote:
During the Virtual Git Contributors' Summit, Dscho brought up the topic of
"Inclusion & Diversity". We discussed ideas for how to make the community
more welcoming to new contributors of all kinds. Let's discuss some of
the ideas we talked about, and some that have been growing since.

Feel free to pick apart all of the claims I make below. This is based
on my own experience and opinions. It should be a good baseline
for us to all arrive with valuable action items.

I have CC'd some of the people who were part of that discussion. Sorry
if I accidentally left someone out.

I. Goals and Perceived Problems

As a community, our number one goal is for Git to continue to be the best
distributed version control system.

I'm always cautious about "best" (as in "best practice" etc.). Git is only good in its particular environment: version control in physical engineering has different problems with different solutions, and those did pollute the older computer version control systems. It certainly should be 'good'.

  At minimum, it should continue to be
the most widely-used DVCS. Towards that goal, we need to make sure Git is
the best solution for every kind of developer in every industry.

The community is wider than developers, and can include lawyers, historians and screenwriters, all with their own particular needs (e.g. the new timestamp range capability).

The
community cannot do this without including developers of all kinds. This
means having a diverse community, for all senses of the word: Diverse in
physical location, gender, professional status, age, and others.

In addition, the community must continue to grow, but members leave the
community on a regular basis for multiple reasons. New contributors must
join and mature within the community or the community will dwindle. Without
dedicating effort and attention to this, natural forces may result in the
community being represented only by contributors working at large tech
companies focused on the engineering systems of very large groups.

We should also ask why engineering companies don't have the same cycle, so as to compare and contrast the issues.

It is worth noting that this community growth must never be at the cost
of code quality. We must continue to hold all contributors to a high
standard so Git stays a stable product.

I want to "disagree" here about the accidental tone of perfection at all times and in all places.

There is a core integrity to the Git data model, which validates and verifies the stored content of the versions, and that should be inviolate. Beyond that, as the distance from the core increases, the "quality" can soften for both new and existing parts of the code (corner cases, quadratic and worse behaviours, designs that assume small textual repos).

We are poor at clarifying which parts of the code require that top level of integrity, and which parts, further out, are simply convenience capabilities for the broad-based user. And then there is documentation, and the difficulty of understanding Git for the general user. The shift to narrowing the core-git may further reduce the community.

Here are some problems that may exist within the Git community and may
form a barrier to new contributors entering:

1. Discovering how to contribute to Git is non-obvious.

2. Submitting to a mailing list is a new experience for most developers.
    This includes the full review and discussion process.

3. The high standards for patch quality are intimidating to new contributors.

Given that Git does support fixups and further patches in a distributed environment, we can be our own worst enemy here. Maybe we need to look carefully in the mirror.

4. Some people do not feel comfortable engaging in a community without
    a clear Code of Conduct. This discomfort is significant and based on real
    experiences throughout society.

A tricky one. It will depend on how such codes are used and whether they promote community and tolerance.

5. Since Git development happens in a different place than where users
     acquire the end product, some are not aware that they can contribute.

Should we also address the Windows community and ecosystem? Being a good neighbour to the friendly fork? Misunderstandings about the difference between the users and the provider? And so on.

These systemy comments are about ensuring we are solving the right problems and ensuring we don't miss some issue that will negate any good work here.
--
Philip

II. Approach

The action items below match the problems listed above.

1. Improve the documentation for contributing to Git.

In preparation for this email, I talked to someone familiar with issues
around new contributors, and they sat down to try and figure out how to
contribute to Git. The first place they went was https://github.com/git/git
and looked at the README. It takes deep reading of a paragraph to see a
link to the SubmittingPatches docs.

To improve this experience, we could rewrite the README to have clearer
section markers, including one "Contributing to Git" section relatively
high in the doc. We may want to update the README for multiple reasons.
It should link to the new "My First Contribution" document
(https://git-scm.com/docs/MyFirstContribution).

2. Add more pointers to GitGitGadget

We have a reference to GitGitGadget in the GitHub PR template to try and
get people who try to submit a pull request to git/git to instead create
one on GitGitGadget. However, that captures contributors who didn't read
the docs about how to submit! (This is somewhat covered by the "My First
Contribution" doc as well, so making that more visible will also help.)

Could we reference GitGitGadget as part of the Submitting Patches doc
as well?

3. Introduce a new "mentors" mailing list

From personal experience, all new contributors at Microsoft (after Jeff
Hostetler at least) have first had their patches reviewed privately by
the team before sending them upstream. Each time, the new contributor
gained confidence about the code and had help interpreting feedback from
the list.

We want to make this kind of experience part of the open Git community.

The idea discussed in the virtual summit was to create a new mailing
list (probably a Google group) of Git community members. The point of
the list is for a new contributor to safely say "I'm looking for a
mentor!" and the list can help pair them with a mentor. This must
include (a) who is available now? and (b) what area of the code are they
hoping to change?

As evidence that this is a good idea, please see the recent research
paper "'We Don't Do That Here': How Collaborative Editing With Mentors
Improves Engagement in Social Q&A Communities" [1].

[1] http://www.chrisparnin.me/pdf/chi18.pdf

For users asking their first question on Stack Overflow, this group added
a pop-up saying "Would you like someone to help you with this?". Then,
a mentor would assist crafting the best possible question to ensure
the asker got the best response possible.

I believe this would work in our community, too. The action items
are:

a. Create the mailing list and add people to the list.

b. Add a pointer to the list in our documentation.

Note: the people on the mentoring list do not need to be
"senior" community members. In fact, someone who more recently
joined the community has a fresher perspective on the process.

4. Add an official Code of Conduct

So far, the community has had an unofficial policy of "be nice,
as much as possible". We should add a Code of Conduct that is
more explicit about the behavior we want to model. This was also
discussed in the meeting with wide approval.

5. Advertise that Git wants new contributors

After we put items 1-4 in place, we should reach out to the
general tech community that we are interested in new
contributors. It's not enough to open the door; we should
point people to it.

This item is much less explicit about the _how_. This could
be done at the individual level: posting to social media or
blog posts. But perhaps there is something more official we
could do?

III. Measurement

How do we know if any of these items make a difference? We
need to gather data and measure the effects. With the size
of our community, I expect that it will take multiple years
to really see a measurable difference. But, no time like
the present to ask "What does success look like?"

Here are a few measurements that we could use. Each "count"
could be measured over any time frame. We could use major
releases as time buckets: v2.22.0 to v2.23.0, for example.

1. How many first-time contributors sent a patch?

2. How many contributors had their first commit accepted into
    the release?

3. How many contributors started reviewing?

4. How many total patches/reviews did the list receive?

What other measurements would be reasonable? We could try
building tools to collect these measurements for the past
to see historical trends. Based on that data, we may be
able to set goals for the future.
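
As a rough illustration of what such a tool might look like, here is a
minimal Python sketch covering measurement 2 only: authors whose first
commit landed within a given release bucket. It assumes a local clone of
git.git with release tags, and it treats the mailmap-resolved author
email as a contributor's identity, which is only an approximation. The
function names are hypothetical. It says nothing about patches or
reviews sent to the list, which would need the mailing list archive.

```python
import subprocess


def parse_authors(log_output):
    """Turn `git log --format=%aE` output into a set of author emails."""
    # Assumption: a lowercased email address identifies one contributor.
    return {line.strip().lower() for line in log_output.splitlines() if line.strip()}


def authors_in_range(rev_range, repo="."):
    """Return the author emails of all non-merge commits in rev_range."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--no-merges", "--format=%aE", rev_range],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_authors(result.stdout)


def first_time_authors(before, during):
    """Authors seen in the bucket who never appeared before it."""
    return during - before
```

Using the example bucket above, calling
first_time_authors(authors_in_range("v2.22.0"),
authors_in_range("v2.22.0..v2.23.0")) should give the set of authors
whose first commit appeared between those two releases. Running it over
each pair of consecutive release tags would give a historical trend.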

With such a small community, and an expected small number
of new contributors, it may also be good to do interviews
with the new contributors to ask about their experience.
In particular, we would be looking for moments where they
had trouble or experienced friction. Each of those
moments is a barrier that others may not be clearing.


I look forward to the discussion.

Thanks,
-Stolee



