Re: Wildcards in mailmap to hide transgender people's deadnames

On 20/09/2022 12:23, Ævar Arnfjörð Bjarmason wrote:
>> I'm happy to resurrect my SHA-256 hashed mailmap series if we're
>> all willing to agree to not implement trivial decoding features.
>
> I'd think you'd want to be really clear about what that forward promise
> would entail. E.g. I've sometimes wanted a way for "git log" to report
> when it munges commits due to adding notes, re-encoding the data etc. If
> someone submits that sort of feature should it always explicitly leave
> out mailmap-related rewrites?
>
> And even if it does, who do we think we're really helping in the end,
> given the trivial way you could get that with an external "diff" with
> the one-liner above?

I think the most important thing here is that the mailmap should not make discovering old names any more trivial than it already is. I've thought more about what you said, Ævar, and I'm now wary of any mailmap implementation that stores my old and new information next to each other in an encoded form (it doesn't matter whether that's URL encoding or base64). I think it's likely that some external data-mining tool will decode the old address and publish it alongside the new one, so that searching for one email address in a search engine also turns up the other. I think a hash would stop these automated miners, since reversing a hash is too much effort for an untargeted attack (right? If you disagree, how about a salted hash?).

Either way, I think any mailmap-based solution will let a determined adversary link the old and new names, as you showed with your neat one-liner. However, I think a (salted?) hash in the mailmap is sufficient for casual obfuscation, where harassment is unlikely but the user still wants to prevent accidental disclosure or obvious linkage.
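
To make the encoding-versus-hash distinction concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the address, the salt, and the helper names are hypothetical, and none of this is the format of any actual mailmap series. It shows that base64 and URL encoding can be undone by anyone with no prior knowledge, while a SHA-256 digest can only be checked against an address someone already suspects.

```python
import base64
import hashlib
import urllib.parse

# Hypothetical old address, used only for illustration.
OLD_ADDRESS = "old.name@example.com"


def base64_entry(address: str) -> str:
    """Reversible: anyone can decode this without knowing the address."""
    return base64.b64encode(address.encode("utf-8")).decode("ascii")


def url_entry(address: str) -> str:
    """Reversible: percent-encoding is just another spelling of the text."""
    return urllib.parse.quote(address, safe="")


def hashed_entry(address: str, salt: bytes = b"") -> str:
    """One-way: the digest does not contain the address itself."""
    return hashlib.sha256(salt + address.encode("utf-8")).hexdigest()


def matches(candidate: str, digest: str, salt: bytes = b"") -> bool:
    """A hash can only confirm a guess; it cannot be decoded."""
    return hashed_entry(candidate, salt) == digest


# An automated miner recovers the encoded forms with one library call each.
recovered_b64 = base64.b64decode(base64_entry(OLD_ADDRESS)).decode("utf-8")
recovered_url = urllib.parse.unquote(url_entry(OLD_ADDRESS))

# With a hash, the miner needs a candidate address to test.
SALT = b"per-repository-salt"  # hypothetical; assumed to be stored publicly
digest = hashed_entry(OLD_ADDRESS, SALT)
guess_right = matches(OLD_ADDRESS, digest, SALT)
guess_wrong = matches("someone.else@example.com", digest, SALT)
```

Note that a salt stored publicly next to the digest only defeats precomputed lookup tables. Anyone who already suspects a specific old address can still confirm it, which is the targeted case I don't expect a hash to cover.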

>> I also have an alternate proposal which I pitched to some folks at Git
>> Merge and which I just finished writing up that basically moves personal
>> names and emails out of commits, replacing them with opaque identifiers,
>> and using a constantly squashed mailmap commit in a special ref to store
>> the mapping.  This doesn't address changing identities in existing
>> commits, which as we've seen are nearly impossible to fix, but it does
>> address new ones.  I've sent it out at
>> https://lore.kernel.org/git/20220919145231.48245-1-sandals@xxxxxxxxxxxxxxxxxxxx/.
>
> As I understand the difference in this scenario a hypothetical future
> repo's Y commit's authorship would have been opaque in the first place
> using this mechanism, and via your "refs/mailmap" you'd have mapped
> Y=Bob.
>
> You then make a future X commit, and map X=Alice, and have a .mailmap
> entry which mapped Y=X, but that entry would refer to the opaque value.
>
> That certainly changes things in a fundamental way, and goes most or all
> of the way to mitigating what I've been pointing out as a flaw in these
> proposals.
>
> I'd still be very much on the fence about whether we'd ever want to
> recommend that to someone concerned with "harassment" and the like (as
> opposed to a milder social preference), as all it would take to get to
> that point is someone having a copy of the older "refs/mailmap" to
> unmask the previous "Y".

First, I want to say that I really like your proposal, Brian! I didn't think this subject would get the attention it did, but I'm happy it's being picked up the way it is, and to see this lively discussion going on between y'all!

And Ævar, you're right that having an older copy would allow someone to discover a mapping from the old name to the new one. But that will be true however we implement this, because the adversary can always keep an old copy of the entire repo, clone the new one, and compare the two logs. (You can probably come up with a neat one-liner, but that's beside the point ;-).) I think the most appropriate threat model here is to assume that everyone who accessed the repo before the name change will notice the change and will be able to create a mapping. Instead, our goal should be a system that ensures that people who first access the repo after the name change are unable to find the old name at all. I think Brian's proposal achieves this. This is analogous to the real world, where people who knew me before my transition will probably never (completely) forget my old name, and it's useless to try to make that happen, but at least I can prevent new people I meet from finding out the old name.
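
As a toy model of that threat model (my own simplification, not Brian's actual design; the identifier, names, and function below are all made up), imagine commits that carry only an opaque author identifier, with the display name looked up in a separate mapping that is replaced wholesale on a name change:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author_id: str  # opaque identifier stored in the commit (hypothetical)
    subject: str


def render_log(commits, identity_map):
    """Resolve opaque identifiers to display names at log time."""
    return [f"{identity_map[c.author_id]}: {c.subject}" for c in commits]


# The commits never change; only the mapping does.
commits = [
    Commit("id-7f3a", "Fix typo in README"),
    Commit("id-7f3a", "Add wildcard support"),
]

# Mapping as it existed before the name change. Only people who fetched
# the repo back then have a copy of this.
mapping_before = {"id-7f3a": "Old Name <old@example.com>"}

# Mapping after the name change. Because the mapping is squashed, this is
# the only version a fresh clone receives.
mapping_after = {"id-7f3a": "New Name <new@example.com>"}

log_for_new_clone = render_log(commits, mapping_after)
log_for_old_copy = render_log(commits, mapping_before)
```

In this model a fresh clone holds no trace of the old name at all, while anyone who kept `mapping_before` can still resolve the identifier the old way. That is the split I'm arguing we should accept.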

- Florine




