On Fri, Mar 25 2022, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > >> In the preceding [1] (pack-objects: move revs out of >> get_object_list(), 2022-03-22) the "repo_init_revisions()" was moved >> to cmd_pack_objects() so that it unconditionally took place for all >> invocations of "git pack-objects". >> >> We'd thus start leaking memory, which is easily reproduced in >> e.g. git.git by feeding e83c5163316 (Initial revision of "git", the >> information manager from hell, 2005-04-07) to "git pack-objects"; >> ... >> Narrowly fixing that commit would have been easy, just add call >> repo_init_revisions() right before get_object_list(), which is >> effectively what was done before that commit. >> >> But an unstated constraint when setting it up early is that it was >> needed for the subsequent [2] (pack-objects: parse --filter directly >> into revs.filter, 2022-03-22), i.e. we might have a --filter >> command-line option, and need to either have the "struct rev_info" >> setup when we encounter that option, or later. >> >> Let's just change the control flow so that we'll instead set up the >> "struct rev_info" only when we need it. Doing so leads to a bit more >> verbosity, but it's a lot clearer what we're doing and why. > > Is this about "we take it as given that the use of rev_info leaks > until we fix revisions API, so let's keep its use limited to avoid > unnecessary leaks"? Not exactly,. When you use the revisions API to do "filter" stuff in this codepath it leaks both before & after Derrick's patches, so nothing has changed in that case, but... > If so, it sort-of makes sense, but smells like a roundabout way to > address the issue. An obvious alternative is to wait until both the > topic and the "plug revision API" topic graduate and then add a > "release" call to release the resource in the same sope as the > unconditional call to init_revisions at the end. I do not quite get > what on-demand lazy set-up buys us. What we need to lazily set-up, > when we do lazily set-up, needs to be released either way, no? ...We were doing lazy setup of "struct rev_info" before the parent series, and as a result it introduces a new memory leak. We do a malloc() for some diff.c code that revisions.c uses unconditionally, which then don't use at all in some common cases. The patch I've submitted on top just restores the previous state of the initialization being lazy, but in a way that has to be adapted for other code changes the series made.