On Thu, 26 Dec 2024 at 14:33, Sasha Levin <sashal@xxxxxxxxxx> wrote:
>
> Which means that folks should be able to use a fairly short abbreviated
> commit IDs in messages, specially for commits with a long subject line.

So I don't think we should take this as a way to make *shorter* IDs,
just as a way to accept historical short IDs.

Also, I absolutely detest how you made this be all about "short IDs".

As mentioned in my very original email on this matter, the actual REAL
AND PRESENT issue isn't ambiguous IDs. We don't really have them.

What we *do* have is "wrong IDs". We have a ton of those.

Look here, you can get a list of suspiciously wrong SHA1s with
something like this:

    git log |
        egrep '[0-9a-f]{9,40} \(".*"\)' |
        sed 's/.*[^0-9a-f]\([0-9a-f]\{9,40\}\)[^0-9a-f].*/\1/' |
        sort -u > hexes

which generates a list of things that look like commit IDs (ok, there's
probably smarter ways) in our git logs.

Now, *some* of those won't actually be commit IDs at all, they'll just
be random hex numbers the above finds. But most of them will indeed be
references to other commits.

Then you try to find the bogus ones by doing something like

    cat hexes |
        while read i; do
            git show $i >& /dev/null || echo "No $i SHA"
        done

and you will get a lot of hits. A *LOT*.

Look, I didn't check very many of them. Mainly because it gets *so*
many hits, and I get bored very easily. But I did check a handful, just
to get a feel for things.

And yes, some of them were random hex numbers unrelated to actual git
IDs, but most were really supposed to be git IDs. Except they weren't -
or at least not from the mainline tree.

For example, look at commit daa07cbc9ae3 ("KVM: x86: fix L1TF's MMIO
GFN calculation") which references one of those nonexistent commit IDs:

    Fixes: d9b47449c1a1 ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")

and I have no idea where that bogus commit ID comes from. Maybe it's a
rebase. Maybe it's from a stable tree. But it sure doesn't exist in
mainline.
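
For anyone who wants to repeat the experiment, the two steps above can
be folded into a pair of shell functions. This is only a sketch, not
the exact pipeline quoted above: the function names extract_ids and
report_missing are made up for illustration, the pattern is just as
approximate as the original (it will still pick up stray hex numbers),
and it uses "git rev-parse --verify" with a "^{commit}" suffix instead
of "git show" so that only commit objects count as hits.

```shell
#!/bin/sh
# Sketch only: extract_ids and report_missing are hypothetical helper
# names, not git commands.

# Read log text on stdin and print the hex part of every
#   <9-40 hex digits> ("subject")
# reference, one ID per line, sorted and de-duplicated.
# Only the first such reference on each input line is reported.
extract_ids() {
    grep -oE '[0-9a-f]{9,40} \(".*"\)' |
        sed -E 's/^([0-9a-f]+) .*/\1/' |
        sort -u
}

# Read candidate IDs on stdin and print a line for each one that does
# not resolve to a commit in the current repository. The ^{commit}
# suffix makes git look the object up, so IDs that are unknown,
# ambiguous, or name something other than a commit are all reported.
report_missing() {
    while read -r id; do
        git rev-parse --quiet --verify "$id^{commit}" >/dev/null 2>&1 ||
            echo "No $id SHA"
    done
}
```

Run from the top of a kernel checkout, "git log | extract_ids |
report_missing" should list the same kind of suspects as the two
commands above, without the intermediate "hexes" file.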

What *does* exist is commit 28a1f3ac1d0c ("kvm: x86: Set highest
physical address bits in non-present/reserved SPTEs"), which I found by
just doing that

    git log --grep='kvm: x86: Set highest physical address bits in non-present/reserved SPTEs'

and my point is that this is really not about "disambiguating short
SHA1 IDs". Because those "ambiguous" SHA1's to a very close
approximation simply DO NOT EXIST.

But the completely wrong ones? They are plentiful.

For a completely different example, we have ec680c1990e7 ("ide: remove
BUG_ON(in_interrupt() || irqs_disabled()) from ide_unregister()") which
has this in the log message:

    Both BUG_ON()s in ide-probe.c were introduced in commit
    4015c949fb465 ("[PATCH] update ide core")

and it turns out that that commit ID doesn't exist in the kernel tree.
It is actually a real commit ID, and it *does* actually exist in the
historical BK import by Thomas:

    https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=4015c949fb465

so that's an example of a cross-tree ID, and yeah, it might be really
cool to "disambiguate" *those* too. But again, the problem wasn't
actually that the SHA1 was _short_.

            Linus