Re: [PATCH v2 3/3] Revert "pack-objects: lazily set up "struct rev_info", don't leak"

René Scharfe <l.s.r@xxxxxx> · Mon, 28 Nov 2022 22:57:02 +0100

Am 28.11.2022 um 19:32 schrieb Ævar Arnfjörð Bjarmason:
>
> On Mon, Nov 28 2022, René Scharfe wrote:
>
>> Am 28.11.2022 um 16:56 schrieb René Scharfe:>
>>> The problem is "How to use struct rev_info without leaks?".  No matter
>>> where you move it, the leak will be present until the TODO in
>>> release_revisions() is done.
>>
>> Wrong.  The following sequence leaks:
>>
>> 	struct rev_info revs;
>> 	repo_init_revisions(the_repository, &revs, NULL);
>> 	release_revisions(&revs);
>>
>> ... and this here doesn't:
>>
>> 	struct rev_info revs;
>> 	repo_init_revisions(the_repository, &revs, NULL);
>> 	setup_revisions(0, NULL, &revs, NULL);  // leak plugger
>> 	release_revisions(&revs);
>>
>> That's because setup_revisions() calls diff_setup_done(), which frees
>> revs->diffopt.parseopts, and release_revisions() doesn't.
>>
>> And since builtin/pack-objects.c::get_object_list() calls
>> setup_revisions(), it really frees that memory, as you claimed from the
>> start.  Sorry, I was somehow assuming that a setup function wouldn't
>> clean up.  D'oh!
>>
>> The first sequence is used in some other places. e.g. builtin/prune.c.
>> But there LeakSanitizer doesn't complain for some reason; Valgrind
>> reports the parseopts allocation as "possibly lost".
>
> Yes, some of the interactions are tricky. It's really useful to run the
> tests with GIT_TEST_PASSING_SANITIZE_LEAK=[true|check] (see t/README) to
> check these sorts of assumptions for sanity.

That may be true, and looks even useful -- I didn't know the check
value.  I only get a strange error message, though:

   $ GIT_TEST_PASSING_SANITIZE_LEAK=check ./t0001-init.sh
   Bail out! GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak

Same with make test and prove, of course.  And of course I compiled
with SANITIZE=leak beforehand.

But I don't see a connection between my comment and yours.  I was
not running any tests, just the above sequences of function calls,
e.g. in git prune.

>
>> I still think the assumption that "init_x(x); release_x(x);" doesn't
>> leak is reasonable.  Let's make it true.  How about this?  It's safe
>> in the sense that we don't risk double frees and it's close to the
>> TODO comment so we probably won't forget removing it once diff_free()
>> becomes used.
>>
>> ---
>>  revision.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/revision.c b/revision.c
>> index 439e34a7c5..6a51ef9418 100644
>> --- a/revision.c
>> +++ b/revision.c
>> @@ -3055,6 +3055,7 @@ void release_revisions(struct rev_info *revs)
>>  	release_revisions_mailmap(revs->mailmap);
>>  	free_grep_patterns(&revs->grep_filter);
>>  	/* TODO (need to handle "no_free"): diff_free(&revs->diffopt) */
>> +	FREE_AND_NULL(revs->diffopt.parseopts);
>>  	diff_free(&revs->pruning);
>>  	reflog_walk_info_release(revs->reflog_info);
>>  	release_revisions_topo_walk_info(revs->topo_walk_info);
>
> At this point I'm unclear on what & why this is needed? I.e. once we
> narrowly fix the >1 "--filter" options what still fails?

As I wrote: A call to an initialization function followed by a call to a
cleanup function and nothing else shouldn't leak.  There are examples of
repo_init_revisions()+release_revisions() without setup_revisions() or
diff_setup_done() beyond pack-objects.  I mentioned prune, but there are
more, e.g. in sequencer.c.

> But in general: I don't really think this sort of thing is worth
> it. Here we're reaching into a member of "revs->diffopt" behind its back
> rather than calling diff_free(). I think we should just focus on being
> able to do do that safely.

Sure, but the FREE_AND_NULL call is simple and safe, while diff_free()
is complicated and calling it one time too many can hurt.

> WIP patches I have in that direction, partially based on your previous
> "is_dead" suggestion:
>
> 	https://github.com/avar/git/commit/e02a15f6206
> 	https://github.com/avar/git/commit/c718f36566a

Copy-typed the interesting parts of the first patch like a medieval monk
because there doesn't seem to be a download option. :-|

> I haven't poked at that in a while, I think the only outstanding issue
> with it is that fclose() interaction.

You mean the t3702-add-edit.sh failure on Windows mentioned in the
commit message of e02a15f6206?  That's caused by the file being kept
open and thus locked during the call of the editor.  Moving the
release_revisions() call in builtin/add.c::edit_patch() before the
launch_editor() call fixes that by closing the file.

> I think for this particular thing there aren't going to be any bad
> side-effects in practice, but I also think convincing oneself of that
> basically means putting the same amount of work in as just fixing some
> of these properly.

Not to me, but perhaps that TODO is easier solved that I expected.
In any case, with the mentioned edit_patch() change described above
e02a15f6206 passes the test suite on Windows for me.

René