On Mon, Dec 14, 2020 at 08:48:14PM -0500, Jeff King wrote:

> On Sun, Dec 13, 2020 at 01:05:38AM +0000, brian m. carlson wrote:
>
> > Note that this is not perfect, because a user can simply look up all the
> > hashed values and find out the old values. However, for projects which
> > wish to adopt the feature, it can be somewhat effective to hash all
> > existing mailmap entries and include some no-op entries from other
> > contributors as well, so as to make this process less convenient.
>
> I remain unconvinced of the value of any noop entries. Ultimately it's
> easy to invert a one-way hash that comes from a small known set of
> inputs. And that's true whether there are extra noops or not.
>
> The interesting argument IMHO is that somebody has to _bother_ to invert
> the hash. So it means that the old and new identities do not show up
> next to each other in a file indexed by search engines, etc. That drops
> the low-hanging fruit.
>
> And from that argument, I think the obvious question becomes: is it
> worth using a real one-way function, as opposed to just obscuring the
> raw bytes (which Ævar went into in more detail). I don't have a strong
> opinion either way (the obvious one in favor is that it's less expensive
> to do so; and something like "git log" will have to either compute a lot
> of these hashes, or cache the hash computations internally).
>
> I think somebody also mentioned that there's value in the social
> signaling here, and I agree with that. But that is true even for a
> reversible encoding, I think.

After re-reading what I wrote, I just wanted to make clear: overall the
feature makes sense to me. I am questioning only the argument for it,
and whether a one-way hash is the right tradeoff there.

-Peff
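
To illustrate the point above that a one-way hash over a small known set of
inputs is easy to invert, here is a minimal sketch. It is not the format
from the patch under discussion: the hashing scheme (SHA-256 of the
lowercased address), the function names, and the addresses are all
assumptions made up for illustration. The candidate set stands in for the
addresses anyone can already read from a project's history.

```python
import hashlib


def hash_identity(email: str) -> str:
    # Hypothetical obscuring scheme (an assumption, not the patch's format):
    # SHA-256 of the lowercased address, hex-encoded.
    return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def invert_hashes(hashed_entries, known_emails):
    # Hash every candidate once, then look each hashed entry up in the table.
    lookup = {hash_identity(e): e for e in known_emails}
    return {h: lookup[h] for h in hashed_entries if h in lookup}


# Candidate set: addresses visible in the commit history (made-up examples).
known = ["old.name@example.com", "someone@example.org", "other@example.net"]

# Hashed mailmap-style entries, including a no-op entry for another contributor.
hashed = [hash_identity("old.name@example.com"), hash_identity("other@example.net")]

recovered = invert_hashes(hashed, known)
for digest, email in recovered.items():
    print(digest[:12], "->", email)
```

Under these assumptions, the work is one hash per known address, and the
no-op entry is recovered along with the real one, so it does not add any
meaningful cost for someone who bothers to run this.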