Re: [PATCH v2 10/10] tag: don't misreport type of tagged objects in errors

Jeff King <peff@xxxxxxxx> · Thu, 1 Apr 2021 03:54:56 -0400

On Wed, Mar 31, 2021 at 10:46:22PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Neither of those types is the correct one. And the segfault is just a
> > bonus! :)
> >
> > I'd expect similar cases with parsing commit parents and tree pointers.
> > And probably tree entries whose modes are wrong.
> 
> So the segfault happens without my patches,

Yeah, sorry if that was unclear. It is definitely a pre-existing bug.

> but the change is that
> before we'd always get it wrong and say "commit, not a tree", but now
> we'll get it right some of the time. Patching the relevant object.c code
> to emit different messages from the various functions shows that it's
> the oid_is_type*() functions that get it right, but object_as_type() is
> wrong as before.
> 
> So that's certainly something I missed.
> 
> But are there any cases where it makes things worse? Or is it just that
> it's not a full fix in all cases, but only a partial one?

Right, I don't think your patch is making anything worse. It's just that
it does not cover all cases where we see an object as two different
types. Nor can it, since it is relying on code paths that actually parse
the object, and not all of them do.

> > My point is that we are not always coming from parse_object_buffer()
> > when we see these error messages.
> 
> If my solution of relying on the parsed v.s. non-parsed shouldn't we
> just devolve to a full object info lookup when emitting the error? It's
> more expensive, but we're emitting an error anyway...

That's certainly one option (that I suggested earlier in [0]). If we go
that route, then we do not need any of this "the caller passes in an
extra bit to say that it is parsing the object, and it found a tree",
because the error routine in object_as_type() would consult the odb
itself.

But I still think it does not make the error messages fully useful. We
might say "object X is really a tree in the odb, but we previously saw
it as a commit". But we will still have to return NULL from
lookup_tree(), so whatever containing object referenced X, _even though
it has the correct type_, will be the one to propagate the failure up
the stack. It was whoever was responsible for that "previously saw" that
is actually corrupt, and we no longer know who that was.

Which is why I wonder if it is worth even bothering to put a lot of
effort in here. If the issue is just that "X is a foo, not a bar" is
sometimes misleading, then we could solve that by simply making the
message more precise ("we saw X as a foo and a bar; one of them is
wrong"). Even if we could know _which_ is wrong with respect to what's
in the object contents, it isn't all that helpful without being able to
tell the user which object reference was the one that led us to the
wrong conclusion.

-Peff

[0] https://lore.kernel.org/git/YGBHH7sAVsPpVKWd@xxxxxxxxxxxxxxxxxxxxxxx/