Johannes Schindelin <johannes.schindelin@xxxxxx> writes:

> On 2015-06-21 19:15, Junio C Hamano wrote:
> Michael Haggerty <mhagger@xxxxxxxxxxxx> writes:
>> That's brilliant.
>>
>> Just to make sure I am reading you correctly, you mean the current
>> overall structure:
>>
>> [...]
>
> The way I read Michael's mail, he actually meant something different:
> if all of the blob-related errors/warnings are switched to "ignore",
> simply skip unpacking the blobs.

That is how I read his mail, too.

But because IIRC we do not check anything special with blobs other
than that we can read them correctly, my description of "overall
structure" stayed at a very high conceptual level.  The unpacking may
happen at a much higher level in the code, i.e. it comes way before
this part of the logic flow:

	if ("is bad_blob ignored?")
		;
	else if (! "is the blob loadable and well-formed?") {

in which case the "is bad_blob ignored?" check may have to happen
before we unpack the object.

And I do not suggest introducing yet another BAD_BLOB error class; I
would have guessed that you already have an error class for objects
that are not stored correctly (be it a truncated loose object, a
checksum mismatch in the packed base object, or a corrupt delta in a
pack).

It so happens that blob is the only type of object that does not have
outgoing links that are needed for the connectivity check, so even if
you allow ignoring the "error class for objects that are not stored
correctly", you would still have to read trees, commits and tags; it
would be a natural consequence of ignoring that class of errors that
you would get a quick-and-dirty fsck by not unpacking blobs.

Of course, that assumes that you can tell an object is a blob without
unpacking.
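To make the ordering concrete, here is a rough sketch in C of asking
the "is this class ignored?" question before paying for the unpack.
All of the names (object_stub, fsck_opts, must_unpack, check_object)
are made up for illustration and are not git's actual identifiers.
The sketch assumes the object type is already known without inflating
the content, and the storage_ok flag merely stands in for what a real
unpack would discover.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not git's real data structures. */
enum obj_type { OBJ_COMMIT, OBJ_TREE, OBJ_BLOB, OBJ_TAG, OBJ_UNKNOWN };

struct object_stub {
	enum obj_type type; /* assumed known WITHOUT unpacking; OBJ_UNKNOWN otherwise */
	bool storage_ok;    /* stand-in for the result of really inflating the object */
	int unpack_count;   /* how many times we paid the cost of unpacking */
};

struct fsck_opts {
	/* the "objects that are not stored correctly" class is set to "ignore" */
	bool ignore_bad_storage;
};

/*
 * Decide BEFORE unpacking.  Trees, commits and tags carry outgoing
 * links needed for the connectivity check, and an object of unknown
 * type cannot be proven to be a blob, so all of those must be read.
 */
bool must_unpack(const struct fsck_opts *opts, const struct object_stub *obj)
{
	assert(opts && obj);
	if (obj->type != OBJ_BLOB)
		return true;
	return !opts->ignore_bad_storage;
}

/* Returns the number of errors reported for this object (0 or 1). */
int check_object(const struct fsck_opts *opts, struct object_stub *obj)
{
	if (!must_unpack(opts, obj))
		return 0;               /* blob skipped entirely: nothing inflated */
	obj->unpack_count++;            /* simulated unpack */
	if (obj->storage_ok)
		return 0;
	return opts->ignore_bad_storage ? 0 : 1;
}
```

With ignore_bad_storage set, a blob never increments unpack_count,
while a tree is still read once; that is the whole of the
"quick-and-dirty" saving in this toy model.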
If a tree entry claims that an object is a blob by having 100644 as
its mode, then unless you unpack the object pointed at by that tree
entry to make sure it is a blob, you wouldn't be able to detect a case
where a non-blob object is stored with 100644 mode, which would be an
error in the containing tree object that we may want to detect.

I am not sure if "skipping inflation of blobs, but still ensure
connectivity and tree integrity" is really a viable mode of
quick-and-dirty operation.  I would imagine you would need to lose a
bit more than "we don't bother reading blobs" (which is OK by me, but
I am just pointing out that (1) I do not mean to say we should add
BAD_BLOB as a new class, and (2) what Michael's automatic --quick
bypass skips may not be limited to suppressing the "we cannot read
this blob object" class, but may also need to suppress checks for some
forms of tree integrity violation).

Thanks.
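To illustrate the tree-entry concern above, here is a toy model in C.
The struct and function names are hypothetical and are not taken from
git's code; actual_type stands for what one would learn only by
reading the object that the entry points at.  The mode values are the
usual ones (040000 for a tree, 160000 for a gitlink, and 100644,
100755 and 120000 for entries that must point at a blob).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not git's real data structures. */
enum obj_type { OBJ_COMMIT, OBJ_TREE, OBJ_BLOB, OBJ_TAG };

struct tree_entry {
	unsigned mode;             /* e.g. 0100644 for a regular file */
	enum obj_type actual_type; /* known only after reading the object itself */
};

/* The object type that a tree entry claims through its mode. */
enum obj_type type_claimed_by_mode(unsigned mode)
{
	if (mode == 040000)
		return OBJ_TREE;
	if (mode == 0160000)
		return OBJ_COMMIT;  /* gitlink (submodule) */
	return OBJ_BLOB;            /* 100644, 100755, 120000 */
}

/*
 * Count entries whose mode disagrees with the type of the object they
 * point at.  With skip_blobs set, entries that claim to be blobs are
 * never examined, so a non-blob hiding behind 100644 goes unnoticed.
 */
int count_mode_mismatches(const struct tree_entry *entries, int n, bool skip_blobs)
{
	int found = 0;

	assert(entries || n == 0);
	for (int i = 0; i < n; i++) {
		enum obj_type claimed = type_claimed_by_mode(entries[i].mode);

		if (skip_blobs && claimed == OBJ_BLOB)
			continue;   /* the "quick" mode trusts the mode bits */
		if (claimed != entries[i].actual_type)
			found++;
	}
	return found;
}
```

Given a tree with one entry that has mode 100644 but points at a tree
object, the full check counts one mismatch and the skip_blobs variant
counts none, which is the class of tree integrity violation that a
quick mode would have to give up on.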