Re: [PATCH 02/10] unpack_loose_header(): simplify next_out assignment

Taylor Blau <me@xxxxxxxxxxxx> · Thu, 27 Feb 2025 19:18:49 -0500

On Tue, Feb 25, 2025 at 01:29:00AM -0500, Jeff King wrote:
> When using OBJECT_INFO_ALLOW_UNKNOWN_TYPE to unpack a header that
> doesn't fit into our initial 32-byte buffer, we loop over calls
> git_inflate(), feeding it our buffer to the "next_out" pointer each
> time. As the code is written, we reset next_out after each inflate call
> (and after reading the output), ready for the next loop.
>
> This isn't wrong, but there are a few advantages to setting up
> "next_out" right before each inflate call, rather than after:
>
>   1. It drops a few duplicated lines of code.
>
>   2. It makes it obvious that we always feed a fresh buffer on each call
>      (and thus can never see Z_BUF_ERROR due to due to a lack of output
>      space).
>
>   3. After we exit the loop, we'll leave stream->next_out pointing to
>      the end of the fetched data (this is how zlib callers find out how
>      much data is in the buffer). This doesn't matter in practice, since
>      nobody looks at it again. But it's probably the least-surprising
>      thing to do, as it matches how next_out is left when the whole
>      thing fits in the initial 32-byte buffer (and we don't enter the
>      loop at all).

Thanks for calling (3) out. There is definitely a subtle behavior change
there, but having the context that it doesn't matter in practice is
useful in judging whether or not the refactoring is OK.

In any event, I agree that this is a much cleaner read, and recall
thinking something along the same lines as (1) when we were working on
this together.

Thanks,
Taylor