Re: [PATCH] pack-objects: lazily set up "struct rev_info", don't leak

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 25 Mar 2022 11:53:05 -0700

Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:

> In the preceding [1] (pack-objects: move revs out of
> get_object_list(), 2022-03-22) the "repo_init_revisions()" was moved
> to cmd_pack_objects() so that it unconditionally took place for all
> invocations of "git pack-objects".
>
> We'd thus start leaking memory, which is easily reproduced in
> e.g. git.git by feeding e83c5163316 (Initial revision of "git", the
> information manager from hell, 2005-04-07) to "git pack-objects";
> ...
> Narrowly fixing that commit would have been easy, just add call
> repo_init_revisions() right before get_object_list(), which is
> effectively what was done before that commit.
>
> But an unstated constraint when setting it up early is that it was
> needed for the subsequent [2] (pack-objects: parse --filter directly
> into revs.filter, 2022-03-22), i.e. we might have a --filter
> command-line option, and need to either have the "struct rev_info"
> setup when we encounter that option, or later.
>
> Let's just change the control flow so that we'll instead set up the
> "struct rev_info" only when we need it. Doing so leads to a bit more
> verbosity, but it's a lot clearer what we're doing and why.

Is this about "we take it as given that the use of rev_info leaks
until we fix revisions API, so let's keep its use limited to avoid
unnecessary leaks"?

If so, it sort-of makes sense, but smells like a roundabout way to
address the issue.  An obvious alternative is to wait until both the
topic and the "plug revision API" topic graduate and then add a
"release" call to release the resource in the same sope as the
unconditional call to init_revisions at the end.  I do not quite get
what on-demand lazy set-up buys us.  What we need to lazily set-up,
when we do lazily set-up, needs to be released either way, no?