On Thu, Oct 14 2021, Alexandr Miloslavskiy wrote: > On 14.10.2021 4:19, Ævar Arnfjörð Bjarmason wrote: >> I'd be interested in a copy of it, I've been slowly trying to improve >> these sorts of corruption cases. > > Sent. Thanks, I can't promise I'll take a look at it in detail time soon, but I was going to loop back to checking out these corruption cases at some point. >> I wonder if this and other issues you encountered wouldn't need a full >> "fsck", but merely gc triggering a complete repack. > > That sounds slow :( For example, it's going to be a lot of disk write > bandwidth. Just doing the verification along with regular gc sounds > faster. Having looked at your repo the immedite issue is that you've got a tree in it that has a (manually crafted?) entry that points to a commit object, without the relevant mode being correct: $ git cat-file -p 1d571d7354f99b726bbcc0cb232b3f47846c71a1 100644 blob 0189425cc210555c36383293c468df5da73acc48 common.mak 040000 tree 6a2c4a5ef0b0ee7aa85d88c3147b7558a6a7c29f include Was this created with git itself, or some tool that's manually crafting trees? I.e. that the object on-disk has the exact expected content but is just bad in this particular way points to corruption in git or another tool writing the data, not e.g. FS corruption or a bit-flip. Anyway, getting back on track the "gc" command actually does do exactly what you're suggesting: $ git gc; echo $? error: object 0189425cc210555c36383293c468df5da73acc48 is a commit, not a blob fatal: entry 'common.mak' in tree 1d571d7354f99b726bbcc0cb232b3f47846c71a1 has blob mode, but is not a blob fatal: failed to run repack 128 The problem is that as a user you won't have seen that because we won't get that far without running into the gc.auto limit, then it would have run into that, and you'd have had the contents of gc.log spewed at you by other commands. So maybe we should be more aggressive there, e.g. as a function of repo size or whatever (this repo is 18MB). You really don't need "git fsck" to verify FS corruption or basic object graph issues like these, and I think it's rather unfortunate that we expose it like that. What it does over and beyond a full repack of the repo is to exhaustively verify object contents, which is most useful e.g. if you're running a git service and want to prevent users from pushing crafted corrupt objects, either intentionally or unintentionally. I've also been meaning to look at that aspect of it for a while, i.e. it should be able to have some --fast, it has --connectivity-only, but that one goes a bit too far, although in this case it would have helped you. >> Yes, we still definitely have cases where dealing with this sort of >> thing can be very painful. > > With the new remote promisor code, I think that auto-fixing corrupted > blobs is easy enough (provided they can be found on any remote) ? Hypothetically, but these blobs aren't corrupted, and no amount of fetching something from other places is going to fix a bad DAG. If that thing didn't really point at the wrong object type the hash would be different. The problem is that it was wrong when it was written. I say "hypothetically" because even in the case of a bitflip or whatever coercing git into some sort of auto-repair mode is pretty far off. I've been able to do it manually in some cases, but e.g. promisors not having a blob *and knowing that* is very different from the more general cases of object(s) XYZ being corrupt. We have well-intentioned features like the collision detection that actively gets in the way of some repairs like that (I had a patch[1] to disable it, which as a side-effect made recovering from some forms of corruption easier). But even without that you'll find that e.g. if a recent object is bad, and we'd like to fetch it from upstream, that we're just going to die pretty early, as none of the code involved in say incremental fetching is prepared to run across a bad/corrupt object. Those aren't inherent problems, and it would be very nice to have more such auto-repair in git, just limitations of the current implementation. 1. https://lore.kernel.org/git/20181028225023.26427-5-avarab@xxxxxxxxx/