On Wed, Jan 06 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> On Sun, Jan 03 2021, brian m. carlson wrote:
>>
>> We just have to worry about cases where you're not all of these people
>> in one project's commit metadata and/or .mailmap, and thus mailmap rules
>> would match too generously:
>>
>>     "brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx>
>>     "brian m. carlson" <SANDALS@xxxxxxxxxxxxxxxxxxxx>
>>     "BRIAN M. CARLSON" <sandals@xxxxxxxxxxxxxxxxxxxx>
>>     "BRIAN M. CARLSON" <SANDALS@xxxxxxxxxxxxxxxxxxxx>
>>
>> Is that really plausible? In any case, neither of these two patches make
>> reference to us already having changed this in the past in 1.6.2 and &
>> there being reports on the ML about the bug & us changing it back. See
>> https://lore.kernel.org/git/f182fb1700e8dea15459fd02ced2a6e5797bec99.1238458535u.git.johannes.schindelin@xxxxxx/
>>
>> Maybe we should still do this, but I think for a v3 it makes sense to
>> summarize that discussion etc.
>
> After reading the old discussion again, I am not sure if this is
> worth doing. To many people, it is a promise we've made and kept
> that we treat addresses including the local part case-insensitively
> when summarising commits by ident strings.
>
> I'd really wish that this series were structured to have 5/5 early
> and 3&4/5 squashed into a single final patch.

Something that I only realized after I sent
<87czykvg19.fsf@xxxxxxxxxxxxxxxxxxx>: Any problems .mailmap has with
Turkish "dotless I" we have already with "git log --author=<name> -i".

Not to say that there isn't some problem to solve here, just that if we
do it's a more general issue than mailmap.

As can be inferred from my upthread reply I thought that was ASCII-only,
but it turns out we do set LC_CTYPE based on the user's locale, and it
also applies for English-speakers. E.g. in LANG=en_US.UTF-8
"--author=ævar -i" will work.

Of course that doesn't address the point of whether we should be
DWYM-ing E-Mail addresses, just the sub-claim that one reason we
shouldn't is because of impossible-to-solve Unicode edge cases.
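
To make the "dotless I" point concrete, here is a rough Python sketch of
the difference between ASCII-only and Unicode-aware case-insensitive
matching. It is only an illustration, not how git does it: the helper
names (ascii_fold, unicode_fold, matches) are made up for this example,
and Python's str.casefold() is locale-independent, whereas the git
behaviour described above depends on LC_CTYPE.

```python
def ascii_fold(s: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def unicode_fold(s: str) -> str:
    """Locale-independent Unicode case folding (Python's default rules)."""
    return s.casefold()


def matches(pattern: str, author: str, fold) -> bool:
    """Hypothetical stand-in for a case-insensitive substring match."""
    return fold(pattern) in fold(author)


author = "Ævar Arnfjörð Bjarmason"

# ASCII-only folding does not know that "Æ" and "æ" are the same letter.
print(matches("ævar", author, ascii_fold))

# Unicode-aware folding does.
print(matches("ævar", author, unicode_fold))

# The Turkish case: default folding maps "I" to "i", but leaves the
# dotless "ı" alone, so the two never compare equal without
# locale-specific tailoring.
print(unicode_fold("I") == unicode_fold("ı"))
```

The last comparison is the edge case in question: whether "I" should
match "ı" depends on the language of the text, which no
locale-independent folding can know.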