Many people, through the course of their lives, will change either a
name or an email address. For this reason, we have the mailmap, to map
from a user's former name or email address to their current, canonical
forms. Normally, this works well as it is.

However, sometimes people change a name (or an email) and want to
completely cease use of the former name or email. This could be because
a transgender person has transitioned, because a person has left an
abusive partner or broken ties with an abusive family member, or for any
other number of good and valuable reasons. In these cases, placing the
former name in the .mailmap may be undesirable.

For those situations, let's introduce a hashed mailmap, where the user's
former name or email address can be in the form @sha256:<hash>. This
obscures the former name or email.

In the course of experimenting with some solutions for v2, I noticed
that our mailmap support has a bunch of problems with case sensitivity.
Notably, it treats local-parts of email addresses in a case-insensitive
way, when the RFC specifically says that they are case sensitive, and we
also treat names case insensitively, but only for ASCII characters. Both
of those have been fixed here, and the commit messages explain in lurid
detail why, while incompatible, this is the correct behavior.

I've also added some performance numbers and explained some alternate
solutions in the commit message for the final patch. That's in addition
to the performance improvements I've done so that the feature is both
cheaper for users and nearly invisible for non-users. That isn't quite
the same as adding a perf test, which I haven't done, but I think this
explains the situation quite well. If folks are still dying for a perf
test, I can add one in v3.

I will point out that fully hashing a mailmap isn't necessarily cheap,
but how expensive it is depends on the weighting of current and former
members of the project.
As mentioned in the original thread, I think a hash rather than an
encoding is the right choice here. It is likely that in a few
iterations of hardware, all users will have accelerated SHA-256 and the
cost will end up being a handful of cycles per name overall.

Changes from v1:
* Fix case-sensitivity problems in the mailmap.
* Add documentation.
* Add explanation of how to compute the value.
* Add some optimizations to improve performance.
* Improve commit message to discuss performance numbers and explain
  rationale better.

brian m. carlson (5):
  mailmap: add a function to inspect the number of entries
  mailmap: switch to opaque struct
  t4203: add failing test for case-sensitive local-parts and names
  mailmap: use case-sensitive comparisons for local-parts and names
  mailmap: support hashed entries in mailmaps

 Documentation/mailmap.txt |  28 ++++++++
 builtin/blame.c           |   2 +-
 builtin/check-mailmap.c   |   4 +-
 builtin/commit.c          |   2 +-
 mailmap.c                 | 139 +++++++++++++++++++++++++++++++++-----
 mailmap.h                 |  15 ++--
 pretty.c                  |   4 +-
 pretty.h                  |   2 +-
 revision.c                |   2 +-
 revision.h                |   3 +-
 shortlog.h                |   3 +-
 t/t4203-mailmap.sh        |  64 +++++++++++++++++-
 12 files changed, 236 insertions(+), 32 deletions(-)
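
For readers who want a feel for the @sha256:<hash> form mentioned
above, here is a rough sketch in Python rather than the C used in the
series. It rests on assumptions not stated in this letter: that <hash>
is the lowercase hex SHA-256 of the exact UTF-8 bytes of the former
name or email, with no trailing newline and no normalization. The
documentation patch is the authoritative description of how to compute
the value; the helper names below are hypothetical.

```python
import hashlib

# Prefix taken from the cover letter; everything after it is assumed
# to be the hex digest of the former name or email address.
PREFIX = "@sha256:"


def hashed_mailmap_token(value: str) -> str:
    """Build a hypothetical @sha256:<hash> token for a name or email.

    Assumption: the input is hashed as raw UTF-8 bytes, with no
    trailing newline and no case folding or other normalization.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return PREFIX + digest


def token_matches(token: str, candidate: str) -> bool:
    """Check whether a candidate name or email hashes to the token."""
    return token == hashed_mailmap_token(candidate)


if __name__ == "__main__":
    # Placeholder address for illustration only.
    print(hashed_mailmap_token("former@example.com"))
```

Because the digest covers the exact bytes, "A@example.com" and
"a@example.com" produce different tokens under these assumptions, which
is in keeping with the case-sensitive comparisons introduced earlier in
the series.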