On Tue, Sep 07, 2021 at 12:58:14PM +0200, Ævar Arnfjörð Bjarmason wrote: > Change the error fsck emits on invalid object types, such as: > > $ git hash-object --stdin -w -t garbage --literally </dev/null > <OID> > > >From the very ungraceful error of: > > $ git fsck > fatal: invalid object type > $ > > To: > > $ git fsck > error: hash mismatch for <OID_PATH> (expected <OID>) > error: <OID>: object corrupt or missing: <OID_PATH> > [ the rest of the fsck output here, i.e. it didn't hard die ] Great. I don't love the second error (since it doesn't really give the user any new information when read after the first) but that's fsck's fault, and not your patch's. > To do this we need to pass down the "OBJECT_INFO_ALLOW_UNKNOWN_TYPE" > flag from read_loose_object() through to parse_loose_header(). Since > the read_loose_object() function is only used in builtin/fsck.c we can > simply change it. See f6371f92104 (sha1_file: add read_loose_object() > function, 2017-01-13) for the introduction of read_loose_object(). > > Why are we complaining about a "hash mismatch" for an object of a type > we don't know about? We shouldn't. This is the bare minimal change > needed to not make fsck hard die on a repository that's been corrupted > in this manner. In subsequent commits we'll teach fsck to recognize > this particular type of corruption and emit a better error message. > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> > --- > builtin/fsck.c | 3 ++- > object-file.c | 11 ++++++++--- > object-store.h | 3 ++- > t/t1450-fsck.sh | 14 +++++++------- > 4 files changed, 19 insertions(+), 12 deletions(-) > > diff --git a/builtin/fsck.c b/builtin/fsck.c > index b42b6fe21f7..082dadd5629 100644 > --- a/builtin/fsck.c > +++ b/builtin/fsck.c > @@ -601,7 +601,8 @@ static int fsck_loose(const struct object_id *oid, const char *path, void *data) > void *contents; > int eaten; > > - if (read_loose_object(path, oid, &type, &size, &contents) < 0) { > + if (read_loose_object(path, oid, &type, &size, &contents, > + OBJECT_INFO_ALLOW_UNKNOWN_TYPE) < 0) { > errors_found |= ERROR_OBJECT; > error(_("%s: object corrupt or missing: %s"), > oid_to_hex(oid), path); > diff --git a/object-file.c b/object-file.c > index 9484c7ce2be..0e6937fad73 100644 > --- a/object-file.c > +++ b/object-file.c > @@ -2562,7 +2562,8 @@ int read_loose_object(const char *path, > const struct object_id *expected_oid, > enum object_type *type, > unsigned long *size, > - void **contents) > + void **contents, > + unsigned int oi_flags) > { > int ret = -1; > void *map = NULL; > @@ -2570,6 +2571,7 @@ int read_loose_object(const char *path, > git_zstream stream; > char hdr[MAX_HEADER_LEN]; > struct object_info oi = OBJECT_INFO_INIT; > + int allow_unknown = oi_flags & OBJECT_INFO_ALLOW_UNKNOWN_TYPE; > oi.typep = type; > oi.sizep = size; > > @@ -2592,8 +2594,11 @@ int read_loose_object(const char *path, > git_inflate_end(&stream); > goto out; > } > - if (*type < 0) > - die(_("invalid object type")); > + if (!allow_unknown && *type < 0) { > + error(_("header for %s declares an unknown type"), path); > + git_inflate_end(&stream); > + goto out; > + } Hmm. I'm not sure that I new test for this error (which may be uninteresting, in which case it is fine to skip). > > -test_expect_success 'fsck hard errors on an invalid object type' ' > +test_expect_success 'fsck error and recovery on invalid object type' ' > git init --bare garbage-type && > empty_blob=$(git -C garbage-type hash-object --stdin -w -t blob </dev/null) && > garbage_blob=$(git -C garbage-type hash-object --stdin -w -t garbage --literally </dev/null) && > - cat >err.expect <<-\EOF && > - fatal: invalid object type > - EOF > - test_must_fail git -C garbage-type fsck >out.actual 2>err.actual && > - test_cmp err.expect err.actual && > - test_must_be_empty out.actual > + test_must_fail git -C garbage-type fsck >out 2>err && > + grep -e "^error" -e "^fatal" err >errors && > + test_line_count = 2 errors && > + grep "error: hash mismatch for" err && > + grep "$garbage_blob: object corrupt or missing:" err && > + grep "dangling blob $empty_blob" out > ' Great. Thanks, Taylor