Re: [PATCH v2 00/10] improve reporting of unexpected objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Mar 28, 2021 at 04:13:30AM +0200, Ævar Arnfjörð Bjarmason wrote:

> Ævar Arnfjörð Bjarmason (10):
>   object.c: stop supporting len == -1 in type_from_string_gently()
>   object.c: refactor type_from_string_gently()
>   object.c: make type_from_string() return "enum object_type"
>   object-file.c: make oid_object_info() return "enum object_type"
>   object-name.c: make dependency on object_type order more obvious
>   tree.c: fix misindentation in parse_tree_gently()
>   object.c: add a utility function for "expected type X, got Y"
>   object.c: add and use oid_is_type_or_die_msg() function
>   object tests: add test for unexpected objects in tags
>   tag: don't misreport type of tagged objects in errors

I'm somewhat skeptical of the final patch, given my comments (just now)
in:

  https://lore.kernel.org/git/YGBHH7sAVsPpVKWd@xxxxxxxxxxxxxxxxxxxxxxx/

I'll quote them here:

> Because when we call, say, lookup_blob() and find that the object is
> already in memory as a non-blob, we don't know who the culprit is.
> Perhaps an earlier part of the code called parse_object(), found that it
> really is a blob on disk, and used that type. But it may equally have
> been the case that we saw a reference to the object as a commit, called
> lookup_commit() on it, and now our lookup_blob() call is unhappy,
> thinking it is really a commit. In that case, one of those references is
> wrong, but we don't know which.
>
> I think a robust solution would be one of:
>
>   - make the message more precise: "saw object X as a commit, but
>     previously it was referred to as a blob". Or vice versa.
>
>   - when we see such a mismatch, go to the object database to say "aha,
>     on disk it is really a blob". That's expensive, but this is an error
>     case, so we can afford to be slow. But it does produce unsatisfying
>     results when it was the earlier lookup_commit() call that was wrong.
>     Because we have to say "object X is really a blob, but some object
>     earlier referred to it as a commit. No idea who did that, though!".

Looking at the final patch, I think you side-step the issue to some
degree because it is only touching the parse_object() code paths, where
we really have looked at the bytes in the object database. So it
basically is doing the second thing above (which is "free" because we
were accessing the odb anyway).

But I think it still has the "oops, somebody made a wrong reference much
earlier" problem. The actual bug is in some other object entirely, whose
identity is long forgotten. I think we would be much better off to say
something like "somebody expected X to be a commit, but now somebody
else expects it to be a blob", which is all that we can reliably say.
And the next step really ought to be running "git fsck" to figure out
what is going on (and we should perhaps even say so via advise()).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux