Re: [PATCH 1/2] tag: factor out get_tagged_oid()

René Scharfe <l.s.r@xxxxxx> · Fri, 6 Sep 2019 17:05:11 +0200

Am 06.09.19 um 09:13 schrieb Jeff King:
> On Thu, Sep 05, 2019 at 09:55:55PM +0200, René Scharfe wrote:
>
>> Add a function for accessing the ID of the object referenced by a tag
>> safely, i.e. without causing a segfault when encountering a broken tag
>> where ->tagged is NULL.
>
> This approach seems pretty reasonable. As somebody who's been
> thinking about this, I'd be curious to hear your thoughts on:
>
>   https://public-inbox.org/git/20190906065606.GC5122@xxxxxxxxxxxxxxxxxxxxx/
>
> which _in theory_ means tag->tagged would never be NULL (we'd catch it
> at the parsing stage and consider that an error). But we'd still
> potentially want to protect ourselves as you do here for code paths
> which don't necessarily check the parse result.

A tag referencing an unknown object sounds strange to me.  I imagine we
might get such a thing when the referenced object is lost (broken repo)
or when the tag is purpose-built by an attacker.  Could such a tag still be used for
anything?  Are there other possible causes?  I suspect the answer to
both questions is "no", and then it makes sense to reject it as early
as possible.

But I may be missing something.  In particular I'm confused by these
patches from February 2008, which seem to suggest that such tags should
not be reported in all cases, but sometimes just silently ignored:

   9684afd967 revision.c: handle tag->tagged == NULL
   cc36934791 process_tag: handle tag->tagged == NULL
   24e8a3c946 deref_tag: handle tag->tagged = NULL

So is there perhaps a use case for them after all?

Leaving that aside: The parsed flag means we saw and checked the object
already.  That is true also for broken objects.  Clearing the flag can
cause the same error to be reported multiple times.  How about setting
it at the start as before, but returning -1 from parse_tag_buffer() if
.parsed == 1 && .tagged == NULL?
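
A rough, untested sketch of that idea follows.  It is not a patch against
tag.c: the structs are simplified stand-ins for the real ones, and
parse_tag_sketch() and errors_reported are hypothetical names that exist
only for illustration.  The "buffer" is reduced to a pointer that is NULL
when the referenced object cannot be resolved.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins, NOT the real definitions from object.h and tag.h. */
struct object {
	unsigned parsed : 1;
};

struct tag {
	struct object object;
	struct object *tagged;	/* stays NULL for a broken tag */
};

/* Hypothetical counter, only here to show how often we complain. */
static int errors_reported;

/*
 * Hypothetical analogue of parse_tag_buffer().  "resolved" stands in
 * for the result of looking up the referenced object.
 */
static int parse_tag_sketch(struct tag *item, struct object *resolved)
{
	if (item->object.parsed)
		/* Seen and checked already; fail quietly if it was broken. */
		return item->tagged ? 0 : -1;

	/* Set the flag at the start, so broken tags are remembered as well. */
	item->object.parsed = 1;

	item->tagged = resolved;
	if (!item->tagged) {
		errors_reported++;
		fprintf(stderr, "error: bad tag pointer\n");
		return -1;
	}
	return 0;
}
```

With this shape a second call for the same broken tag still returns -1,
but the message is printed only once, because .parsed stays set.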

René