On Fri, Sep 10 2021, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
>> On Fri, Sep 10, 2021 at 03:22:47PM +0000, Gwyneth Morgan wrote:
>>
>>> The line for Jessica Clarke should probably just be
>>>
>>> Jessica Clarke <jrtc27@xxxxxxxxxx>
>>>
>>> That works the same and doesn't put a reference to an old name.
>>
>> Thanks, that's a good suggestion. I kind of wonder if these mass
>> mailmap-cleanup patches are a good idea in general. They are making
>> assumptions about how people want their names to be represented, and
>> whether and how they want any mappings to appear. Maybe that's something
>> we should be leaving to people to propose for their own identities.
>>
>> Of course people who aren't active in the project anymore may not bother
>> to do the cleanup, and of course messy data makes me sad. But on the
>> whole, I'm not sure it's that big a deal either way.
>
> I am not enthused by the idea of replying to this thread, knowing
> that many of the CC'ed addresses will bounce X-<, but I agree with
> you on all three counts. Even for those who are no longer active,
> it makes sense to unify multiple idents that are spelled differently
> to help "git shortlog", but which one to unify to is not something
> we can decide without their input.
>
> Which leads me to suggest something like the attached patch. I
> wrote "Please notify us" for those who are no longer active and
> forgot how .mailmap entries are spelled to ask for help correcting.
>
> Of course, the updated instruction does not prevent a motivated
> volunteer to contact the people _individually_ and then send in
> a patch with entries that the volunteer secured consent, perhaps
> in the form of Acked-by ;-)
>
>
>  .mailmap | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git i/.mailmap w/.mailmap
> index 9c6a446bdf..20b581c879 100644
> --- i/.mailmap
> +++ w/.mailmap
> @@ -4,6 +4,11 @@
>  # and/or not always written the same way, making contributions from the
>  # same person appearing not to be so.
>  #
> +# If you find an incorrect entry that affects yourself, please notify us
> +# at <git@xxxxxxxxxxxxxxx> and suggest corrections. Because the way people
> +# want their names to be represented varies, please refrain from touching
> +# entries for other people unless you positively know that the updated
> +# entries are what they want.
>
>  <nico@xxxxxxxxxxx> <nico@xxxxxxx>
>  Alejandro R. Sedeño <asedeno@xxxxxxx> <asedeno@xxxxxxx>

That seems too high a burden for edits that are likely to be
uncontroversial. I.e. this seems targeted at someone who's changing
someone's name without their approval, after the fact and when they
can't be contacted.

Whereas most if not all edits to this file are likely to be janitorial
work such as de-duplicating E-Mail addresses, or cases where the people
involved, even if they can't be contacted, have already shared this
information unambiguously with the world; it just happens to be in the
mailing list archive, not in .mailmap.

E.g. I'd think these were fine, even assuming you can't contact the
parties involved:

 * The ML history reveals that someone who's clearly the same person
   was using N E-Mail addresses in succession. We should probably map
   them all to the latest one; that unclutters shortlog and the like.

 * Ditto even for their name in some cases. Is someone's commit history
   200 commits of abbreviating their middle name, with 1-2 commits at
   the start where they don't? Seems fine to just normalize that.

 * Is the E-Mail address they last used bouncing ("this person doesn't
   work here anymore"), has the domain expired etc.? It would be useful
   to other contributors to map that to this-is-unreachable@xxxxxxxxxxx
   or whatever so they won't waste time CC-ing a bad address.

That said, there are things we shouldn't be doing without asking:

 * Changing a name from Alice to Bob? Yes, ask and have the person ack
   it.

 * Found the current E-Mail address of someone who contributed to
   git.git years ago, but not under that address? Ask them first, they
   may not wish to be contacted at all.

etc.
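
To make the three "fine" cases above a bit more concrete, here is a
rough sketch of what such entries might look like. All names and
addresses are made up for illustration (they are not proposed entries),
and the placeholder for unreachable addresses is just one possible
spelling; the entry forms themselves are the ones gitmailmap(5)
describes.

```
# Case 1: clearly the same person, several addresses in succession;
# map the older addresses to the latest one (hypothetical identities).
A U Thor <author@new.example.com> <author@old.example.com>
A U Thor <author@new.example.com> <author@older.example.com>

# Case 2: normalize a rarely used spelling of a name. The name-only
# form keys on the address, so the old spelling is not recorded here.
Jane Q. Hacker <jane@example.com>

# Case 3: last known address bounces; map it to an agreed-upon
# placeholder (assumed name) so nobody wastes time CC-ing it.
<this-is-unreachable@example.invalid> <bob@defunct.example.com>
```

Comparing "git shortlog -sne" output before and after such a change
should show whether the idents collapse as intended.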