Hi Junio, Peff,

On Fri, Oct 27, 2023 at 10:13:01PM -0400, Jeff King wrote:
> On Sat, Oct 28, 2023 at 09:12:06AM +0900, Junio C Hamano wrote:
>
> > Grouping @gmail.com addresses do not smell all that useful, though.

While I agree with you, I think that's more an exception than the rule.

> > More importantly, it is not clear what "Many reports" refers to. If
> > they are *not* verbatim output from "git log" family of commands,
> > iow, they are produced by post-processing output from "git log"
> > family of commands, then I do not quite see why %aa is useful at
> > all.

I might've been a bit generous with "many reports"; I was mostly
thinking of the ones published by lwn.net, and U-Boot for example. To
some extent, "git shortlog" could be considered a part of that
post-processing chain.

> One way you could directly use this is in shortlog, which these days
> lets you group by specific formats. So:
>
>     git shortlog -ns --group=format:%aA

That's exactly what I implemented this for :-)

> is potentially useful.
>
> I say "potentially" because it really depends on your project and its
> contributors. In git.git the results are mostly either too broad
> ("gmail.com" covers many unrelated people) or too narrow (I'll assume
> I'm the only contributor from "peff.net"). There are a few possibly
> useful ones ("microsoft.com", "gitlab.com", though even those are
> misleading because email domains don't always correspond to
> affiliations).

I agree with your comment here. Grouping everything under "gmail.com",
for example, doesn't provide anything really useful, but we can rely on
mailmap to fix that when appropriate. I think it would otherwise count
as unaffiliated.

I don't claim this to be foolproof, but I do think that it gives a good
overall view of which companies are involved in the project for the
most part.

> So I don't find it useful myself, but I see how it could be in the right
> circumstances. It also feels like a symmetric match to "%al", which
> already exists.
>
> I do find "aa" as the identifier a little hard to
> remember. I guess it's "a" for "address", though I'd have called the
> whole local@domain thing an address thing that. Of course "d" for domain
> would make sense, but that is already taken. If we could spell it as
> %(authoremail:domain) that would remove the question. But given the
> existence of "%al", I'm not too sad to see another letter allocated to
> this purpose in the meantime.

I chose the "a" for "address", but I'm not sold on %aa either. I just
couldn't find anything better that wasn't already taken. What about
"a@"? It's a bit easier to remember, being the first character of the
domain-part.

> Just my two cents as a shortlog --format afficionado. ;) (Of course,
> shortlog itself is the ultimate "you could really just post-process log
> output" example).

I'm a big fan of shortlog --format (and --group) as well!

Taking it a step further, it's also possible to pass in whatever
mailmap you want to generate a "report". Let's say there's a mapping
that only makes sense for a single release; something like this could
be used:

    git -c mailmap.file=git-mailmap-v2.42 shortlog -sn --group=format:%aA

> -Peff

Thanks for your time.

Cheers,
Liam