On Thu, Jun 10 2021, brian m. carlson wrote:

> [[PGP Signed Part:Undecided]]
> On 2021-06-09 at 15:44:59, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Jun 09 2021, Derrick Stolee via GitGitGadget wrote:
>>
>> > Updates in v2
>> > =============
>> >
>> >  * Some edits were removed because they were in contrib/ or
>> >    Documentation/howto/ and these are now listed as exclusions in the
>> >    message of Patch 4.
>>
>> Thanks.
>>
>> >  * Several recommendations to improve the edits in the documentation and
>> >    code comments were incorporated. Those who recommended these edits are
>> >    credited with "Helped-by" tags.
>>
>> I think a v2 is a bit premature with all the active discussion on the v1
>> thread, a lot of which isn't addressed by the v2 or this CL, e.g. many
>> point I[1] and others raised.
>>
>> My main objection of extending this to commit messages and thus making
>> e.g. non-native speakers be on their toes when contributing to the
>> project is gone, so that's good.
>>
>> I'm still not in favor of this change because I think an active
>> recommendation like "Refer to an anonymous user in a gender neutral way"
>> probably isn't needed if we simply document that our preferred prose is
>> to avoid the issue entirely, which is the case in most of our
>> documentation.
>
> I agree that in many cases in technical writing that the passive voice
> (or another technique) may be preferable.  For example, this selection
> about O_TRUNC from open(2):
>
>   If the file already exists and is a regular file and the access mode
>   allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to
>   length 0.  If the file is a FIFO or terminal device file, the O_TRUNC
>   flag is ignored.  Otherwise, the effect of O_TRUNC is unspecified.
>
> Who is truncating it?  Who is ignoring it?  Who is not specifying it?
> In all three cases, the specific actor is unimportant or irrelevant, and
> we're better off using the passive voice here than trying to enumerate
> the actor.

Exactly.
The preferred prose in Git's documentation in this regard should
be the same matter of fact prose found in C library, binutils
etc. documentation.

>> The below for-show patch[2] shows an alternate approach that I think is
>> a better direction than this series.
>>
>> It shows how some of the s/he|she/they/g search-replacements you did
>> could IMO be better if we simply reduced the amount of prose, e.g. in
>> strbuf.h changing:
>>
>>     passes a context pointer, which can be used by the programmer of the
>>     callback as she sees fit.
>>
>> To:
>>
>>     passes a `void *context` to the callback `fn`
>
> In many cases, saying less is better, I agree.  If we don't need to
> refer to a human being, then we don't need to consider any pronouns for
> that human being.  If we communicate things more simply with fewer
> words, then that's clearly better overall for everyone involved.
> Nobody's reading documentation for pleasure, after all.
>
> I do think that the recommendation that we refer to an anonymous user in
> a gender-neutral way still stands, though.  Sometimes we will need to
> refer to the user or another human actor and that will be the most
> natural way to express the idea, so we should use gender-neutral
> language to do so.
>
> So roughly here, I'm in favor of both approaches.

When do we need or even prefer to refer to a user like that?

I haven't seen an example in our documentation that's actually better
off because we're talking about things as if two "people" we need to
invent pronouns for are interacting.

Can anyone name one that'll stand up under scrutiny, i.e. once we can
look at it and see if we couldn't phrase it better by avoiding the issue
this series tries to address with a regex search-replacement?

The diffstat of this series is only:

    12 files changed, 22 insertions(+), 15 deletions(-)

I've looked at all of them and I can't see one that wouldn't be better
if the relevant text was changed similarly to what I've suggested
upthread.
That's why I don't think this proposal is useful. If we accept this
series we're going to be left with an active recommendation for a
pattern that's already almost nonexistent in our documentation.

Perhaps that's because we're doing it 98% right already and aren't using
"he" or "she" but "they" or "their". The multiple ways you can use
"they" or "their" in the English language make that hard to grep for. A
lot of our uses of "they" refer e.g. to a command-line option, and
"their" often means "their arguments", as in the argv vector of a
program.

The skepticism about this being needed at all isn't an opinion I hold
about software documentation in general, but about software in Git's
problem space specifically.

Git isn't something like software to track medical records or tax
filings where we can make a hard assumption that the software is dealing
with data from people, and thus the software's documentation might
regularly expect to need to discuss such an invented cast of characters.

We just have:

 * You: The "user" of the software. Maybe a human being, but that's
   usually no more assumed than the "user" of chmod(2) being a human
   being.

 * Other users, not people, but users in the UID/GID sense of the
   word. Describing system-local interactions that are really two
   operating system users interacting in terms of assuming that they map
   onto two people just adds confusion.

   Note how e.g. chmod(2) and other such documentation rightly avoids
   bringing people into the matter. At most it refers to "owner" or
   "another user" etc.

 * "Other users" on the network, e.g. you make a change, it conflicts
   with upstream.

I think in all these cases saying something like:

    You can add data and commit it, then push it. Once you push it you
    might find another person has made changes on the same branch,
    he/she/it/they changed the same file as you, now you've got a
    conflict...

Is worse than:

    When you push your branch you might get a conflict with the remote's
    upstream branch, if merging it results in a conflict then...

In such scenarios we're talking about e.g. our local state interacting
with remote network state. Those are ultimately commits or other data we
have to deal with in some way. It's almost never important whether that
data was made by a human or some automated system. Inventing a cast of
characters just makes things more confusing.

I think the nearest we come to rightly discussing actual people in the
context of Git's documentation is things like commit envelope data
(names, e-mail addresses). Even those don't cleanly map to human beings,
so our documentation probably shouldn't be implying that in its prose.