Re: [PATCH 5/9] commit-graph: abort as soon as we see a bogus chunk

Taylor Blau <me@xxxxxxxxxxxx> · Thu, 9 Nov 2023 16:18:24 -0500

On Thu, Nov 09, 2023 at 02:17:11AM -0500, Jeff King wrote:
> The code to read commit-graph files tries to read all of the required
> chunks, but doesn't abort if we can't find one (or if it's corrupted).
> It's only at the end of reading the file that we then do some sanity
> checks for NULL entries. But it's preferable to detect the errors and
> bail immediately, for a few reasons:
>
>   1. It's less error-prone. It's easy in the reader functions to flag an
>      error but still end up setting some struct fields (an error I in
>      fact made while working on this patch series).
>
>   2. It's safer. Since verifying some chunks depends on the values of
>      other chunks, we may be depending on not-yet-verified data. I don't
>      know offhand of any case where this can cause problems, but it's
>      one less subtle thing to worry about in the reader code.
>
>   3. It prevents the user from seeing nonsense errors. If we're missing
>      an OIDL chunk, then g->num_commits will be zero. And so we may
>      complain that the size of our CDAT chunk (which should have a
>      fixed-size record for each commit) is wrong unless it's also zero.
>      But that's misleading; the problem is the missing OIDL chunk; the
>      CDAT one might be fine!
>
> So let's just check the return value from read_chunk(). This is exactly
> how the midx chunk-reading code does it.
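
For anyone reading along without the patch in front of them, the
fail-fast shape described in point 3 can be sketched roughly as below.
This is a standalone toy, not the actual commit-graph.c code: the
toy_* names, the error enum, and the fixed 20-byte OID and 36-byte CDAT
record sizes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real commit-graph types. */
enum toy_chunk_id { TOY_OIDL, TOY_CDAT, TOY_NR_CHUNKS };

enum toy_error {
	TOY_OK = 0,
	TOY_ERR_MISSING_OIDL,
	TOY_ERR_BAD_OIDL_SIZE,
	TOY_ERR_MISSING_CDAT,
	TOY_ERR_BAD_CDAT_SIZE
};

struct toy_chunk {
	int present;	/* was the chunk found in the file? */
	size_t size;	/* chunk size in bytes */
};

struct toy_graph {
	size_t num_commits;
	enum toy_error error;	/* first problem found, or TOY_OK */
};

/* Illustrative sizes only (assumed: SHA-1 OIDs, fixed-width CDAT records). */
#define TOY_OID_LEN 20
#define TOY_CDAT_RECORD_LEN 36

static int toy_read_oidl(struct toy_graph *g, const struct toy_chunk *c)
{
	if (!c->present) {
		g->error = TOY_ERR_MISSING_OIDL;
		return -1;
	}
	if (c->size % TOY_OID_LEN) {
		g->error = TOY_ERR_BAD_OIDL_SIZE;
		return -1;
	}
	/* Only set struct fields once the chunk has been validated. */
	g->num_commits = c->size / TOY_OID_LEN;
	return 0;
}

static int toy_read_cdat(struct toy_graph *g, const struct toy_chunk *c)
{
	if (!c->present) {
		g->error = TOY_ERR_MISSING_CDAT;
		return -1;
	}
	/* This check depends on num_commits, i.e. on a verified OIDL. */
	if (c->size != g->num_commits * TOY_CDAT_RECORD_LEN) {
		g->error = TOY_ERR_BAD_CDAT_SIZE;
		return -1;
	}
	return 0;
}

/*
 * Check each reader's return value and bail immediately, so a missing
 * OIDL chunk is reported as such instead of as a bogus CDAT size.
 */
int toy_parse(struct toy_graph *g, const struct toy_chunk *chunks)
{
	assert(g && chunks);
	g->num_commits = 0;
	g->error = TOY_OK;

	if (toy_read_oidl(g, &chunks[TOY_OIDL]) < 0)
		return -1;
	if (toy_read_cdat(g, &chunks[TOY_CDAT]) < 0)
		return -1;
	return 0;
}
```

With a missing OIDL and an otherwise plausible CDAT, toy_parse() stops
at the first reader and reports the OIDL problem; the CDAT size check
never runs against a zero num_commits.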

All very well explained. I hit the same snag you did while working on
the few patches I proposed we put on top of your earlier chunk-format
hardening series.

I'm glad to see this getting cleaned up, and I'm very happy with the
post-image of this patch.

Thanks,
Taylor