On Tue, May 10, 2022 at 04:50:47PM -0700, Linus Torvalds wrote:
> > For what it's worth, as someone who is frequently tracking down and
> > reporting issues, a link to the mailing list post in the commit message
> > makes it much easier to get these reports into the right hands, as the
> > original posting is going to have all relevant parties in one location
> > and it will usually have all the context necessary to triage the
> > problem.
>
> Honestly, I think such a thing would be trivial to automate with
> something like just a patch-id lookup, rather than a "Link:".

I'm not sure that's quite reliable, and I'm speaking from experience of
running git-patchwork-bot, which attempts to match commits to patches.
Patch-id has these important disadvantages:

1. git-patch-id can be fragile: if the maintainer makes changes such as
   adding curly braces, renaming a variable, or editing a code comment
   for clarity, the patch-id stops matching. This happens routinely with
   git-patchwork-bot, and patchwork uses an even laxer algorithm than
   git-patch-id. In fact, I had to hack git-patchwork-bot to fall back
   on Link: tags to match by message-id to address some of the
   maintainers' complaints.

2. git-patch-id doesn't include author/date/commit message, which can
   actually be important for establishing provenance and attribution,
   and their absence can confuse automation. E.g. an author submits a
   patch as part of a large series, gets told to break it apart, then
   submits it as part of a different series. Automated processes trying
   to match commits to submissions won't be able to tell which series
   the commit came from.

Cregit folks (cregit.linuxsources.org) have encountered all of these and
I know from talking to them that they are quite happy to have a way to
match commit provenance to the exact messages in the archives.

> Wouldn't it be cool if you had some webby interface to just go from
> commit SHA1 to patch ID to a lore.kernel.org lookup of where said
> patch was done?

Yes, it's https://cregit.linuxsources.org/ and it's... okay. :) It
certainly doesn't manage to match all commits to patches despite having
access to all of lore.kernel.org archives.

> My argument here really is that "find where this commit was posted" is
>
> (a) not generally the most interesting thing
>
> (b) doesn't even need that "Link:" line.
>
> but what *is* interesting, and where the "Link:" line is very useful,
> is finding where the original problem that *caused* that patch to be
> posted in the first place.

I think the disconnect here is that you're approaching this from the
perspective of a human being, while what many want is a dumb and
reliable way to match commits to ML submissions, which will allow
improving unattended automation.

> So that whole "searching is often an option" is true for pretty much
> _any_ Link:, but I think that for the whole "original submission" it's
> so mindless and can be automated that it really doesn't add much real
> value at all.

Believe me, I've tried, and I really, really like having a fool-proof
way to match commits directly to the exact ML submissions. :(

Even a 99%-reliable fuzzy matching algorithm has enough of a failure
rate to annoy maintainers -- I have many "git-patchwork-bot missed this
commit" complaints in the queue to prove this.

I think we should simply disambiguate the trailer added by tooling like
b4. Instead of using Link:, it can go back to using Message-Id, which is
already standard with git -- it's trivial for git.kernel.org to link
them to lore.kernel.org.

Before:

    Signed-off-by: Main Tainer <main.tainer@xxxxxxxxx>
    Link: https://lore.kernel.org/r/CAHk-=wgAk3NEJ2PHtb0jXzCUOGytiHLq=rzjkFKfpiuH-SROgA@xxxxxxxxxxxxxx

After:

    Signed-off-by: Main Tainer <main.tainer@xxxxxxxxx>
    Message-Id: <CAHk-=wgAk3NEJ2PHtb0jXzCUOGytiHLq=rzjkFKfpiuH-SROgA@xxxxxxxxxxxxxx>

This would allow people to still use Link: for things like linking to
actual ML discussions.

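To illustrate why a dedicated trailer is easy for automation to consume,
here is a minimal Python sketch. It is only an illustration under stated
assumptions: the function names, the example commit message and its
message-id are hypothetical, the percent-encoding of the URL is my guess
at what is safe, and this is not how git.kernel.org or git-patchwork-bot
are actually implemented. It reads only the Message-Id: trailer and
ignores any Link: trailers, which stay free for pointing at discussions.

```python
import re
from urllib.parse import quote

# Matches the trailer shape proposed above: "Message-Id: <id@host>".
MSGID_TRAILER = re.compile(r"^Message-Id:\s*<([^<>\s]+)>\s*$", re.IGNORECASE)


def find_submission_msgid(commit_message):
    """Return the message-id from the first Message-Id: trailer, or None.

    Link: trailers are deliberately not consulted, so a Link: that points
    at a bug report or discussion can never be mistaken for the submission.
    """
    for line in commit_message.splitlines():
        match = MSGID_TRAILER.match(line.strip())
        if match:
            return match.group(1)
    return None


def lore_url(msgid):
    """Build a lore.kernel.org redirect URL for a message-id.

    Assumption: anything outside the usual message-id characters is
    percent-encoded; "@" and "=" are left as-is.
    """
    return "https://lore.kernel.org/r/" + quote(msgid, safe="@=")


# Hypothetical commit message carrying both kinds of trailer.
example = """subsys: fix the thing

Signed-off-by: Main Tainer <main.tainer@example.org>
Link: https://lore.kernel.org/r/some-discussion@example.org
Message-Id: <20220510-example.1@example.org>
"""

msgid = find_submission_msgid(example)
if msgid is not None:
    print(lore_url(msgid))
```

No fuzzy matching is involved: either the trailer is present and the
lookup is exact, or it is absent and the tool knows it has nothing.
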
I know this pollutes commits a bit, but I would argue that this is a
worthwhile trade-off that allows us to improve our automation and better
scale maintainers.

-K