On Wed, Nov 22, 2017 at 10:42:30AM +0900, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > I'm not sure what the right behavior is, but I'm pretty sure that's not > > it. Probably one of: > > > > - skip updating the ref when we see the breakage > > > > - ditto, but terminate the whole operation, since we might be deleting > > other refs and in a broken repo we're probably best to make as few > > changes as possible > > > > - behave as if it was a non-ff, which would allow "--force" to > > overwrite the broken ref. Maybe convenient for fixing things, but > > possibly surprising (and it's not that hard to just delete the > > broken refs manually before proceeding). > > Perhaps the last one would be the ideal endgame, but the second one > may be a good stopping point in the shorter term. This turns out to be a lot trickier than I expected. The crux of the matter is that the case we care about is hidden inside lookup_commit_reference_gently(), which doesn't distinguish between corruption and "not a commit". So there are four cases we care about for this call in fetch: 1. We fed a real sha1 and got a commit (or peeled to one). 2. We fed a real sha1 which resolved to a non-commit, and we got NULL. 3. We fed a real sha1 and the object was missing or corrupted, and we got NULL. 4. We fed a null sha1 and got NULL. Right now we lump cases 2-4 together as "do not do a fast-forward check". That's fine for 2 and 4, but probably not for 3. We can easily catch case 4 ourselves (if we care to), but distinguishing case 3 from the others is hard. How should lookup_commit_reference_gently() signal it to us? Or should lookup_commit_reference_gently() die on corruption? That's not very "gentle", but I think the "gently" here is really about "it might not be a commit", not "the repo might be corrupted". But I think even that may be the tip of the iceberg. The next thing we do is feed the commits to in_merge_bases(), which will happily return "nope" if the old commit cannot be parsed (because it has only a boolean return value). So I dunno. Maybe it is a losing battle to try to pass this kind of corruption information up the stack. I'm tempted to say that there should just be a "paranoid" flag to globally die() whenever we see a corruption (and you could run with it normally, but relax it whenever you're investigating a broken repo). But I doubt even that works. Not having the "old_oid" object at all would be a repo corruption here, but how are the low-level routines supposed to know when a missing object is a corruption and when it is not? -Peff