Re: [PATCH 00/11] midx: clean up t5319 under 'SANITIZE=leak'

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Fri, 22 Oct 2021 10:23:26 +0200

On Fri, Oct 22 2021, Taylor Blau wrote:

> On Thu, Oct 21, 2021 at 01:50:55PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Oct 20 2021, Taylor Blau wrote:
>>
>> > I tried to separate these out based on their respective areas. The last 10% are
>> > all from leaking memory in the rev_info structure, which I punted on for now by
>> > just UNLEAK()-ing it. That's all done in the last patch. If we choose to take
>> > that last patch, then t5319 passes even under SANITIZE=leak builds. But it's a
>> > little gross, so I'm happy to leave it out if others would prefer.
>>
>> I'll defer to you, but I've got a mild preference for keeping it out. An
>> upcoming series of patches I'm submitting (I'm waiting on the current
>> leak fixes to land) will fix most of those rev_info leaks. So not having
>> UNLEAK() markings in-flight makes things a bit easier to manage.
>
> If you don't mind, I think keeping this in (as a way to verify that we
> really did make t5319 leak-free) may be good for reviewers. When you
> clean these areas up, you'll have to remember to remove those UNLEAK()s,
> but hopefully they produce conflicts with your in-flight work that serve
> as not-too-annoying reminders.
>
> I would certainly rather not have to UNLEAK() those at all, so I am very
> excited to see a series from you which handles freeing these resources
> appropriately. It was just too big a bite for me to chew off when
> preparing this quick series.

Having looked at this a bit more carefully I've upgraded that a bit from
"I'll defer to you" to strongly objecting to this specific approach (but
read to the end, I think you can have your cake and eat it too).

I just glanced at this series in my previous pass-over, and evidently
didn't read the CL to the end. I thought based on the commit subject
that it was just a change to some midx code impacting t5319.

But it's a bigger change across the whole test suite than that.

If you run:

    rm .prove; GIT_SKIP_TESTS=t0027 prove -j8 --state=save t[0-9]*.sh :: --immediate

Without this patch (t0027 will take forever), and then:

    GIT_SKIP_TESTS=t0027 prove -j8 --state=failed

You'll see that (I use GNU screen's logging feature to get the output)
we made these tests pass, not just your t5319:
    
    $ grep -w ok screenlog.11 
    t0056-git-C.sh .................................... ok                  
    t1060-object-corruption.sh ........................ ok                  
    t4212-log-corrupt.sh .............................. ok                  
    t4207-log-decoration-colors.sh .................... ok                  
    t5313-pack-bounds-checks.sh ....................... ok                  
    t5506-remote-groups.sh ............................ ok                  
    t5513-fetch-track.sh .............................. ok                  
    t5319-multi-pack-index.sh ......................... ok                  
    t5532-fetch-proxy.sh .............................. ok                  
    t5900-repo-selection.sh ........................... ok                  
    t6011-rev-list-with-bad-commit.sh ................. ok                  
    t6100-rev-list-in-order.sh ........................ ok                  
    t6114-keep-packs.sh ............................... ok                  
    t6131-pathspec-icase.sh ........................... ok                  
    t6003-rev-list-topo-order.sh ...................... ok                  
    t7702-repack-cyclic-alternate.sh .................. ok                  
    t9108-git-svn-glob.sh ............................. ok                  
    t9109-git-svn-multi-glob.sh ....................... ok                  
    t9127-git-svn-partial-rebuild.sh .................. ok                  
    t9133-git-svn-nested-git-repo.sh .................. ok                  
    t9168-git-svn-partially-globbed-names.sh .......... ok                  

I've got subsequent patches on top of what's now in-flight that fix
those bigger leaks incrementally, and as part of that add
"TEST_PASSES_SANITIZE_LEAK=true" to /all/ the relevant passing
tests.

I.e. there isn't a 1=1 correspondance between all current passing tests
and "TEST_PASSES_SANITIZE_LEAK=true" markings (some are still left
unmarked). But there has been a 1=1 mapping between freshly pasing tests
as a result of leak fixes and tests that get that
"TEST_PASSES_SANITIZE_LEAK=true" marking.

I think it helps a lot to review those patches & assess their impact if
code changes can be paired with relevant test suite changes, which isn't
the case if we've got simultaneous UNLEAK() markings for major leaks
in-flight.

So the practical effect of that is that I'll need to rebase all that,
change my testing setup to test those one-at-at-time with "git rebase -i
--exec" before submission (which I do anyway), but now with some
selective revert of an earlier UNLEAK() to assert that I'm still fixing
the things I expected.

Then either drop the parts of the commit messages that say "this fixes
leak XYZ, making tests A, B, C pass" to omit any mention of "A, B, C",
or reword it for the earlier now-landed UNLEAK() markings.

These are also just the fully passing tests I ran as a one-off for this
E-Mail. For those I do more exhaustive runs where I'll e.g. spot if a
test goes from N failing to N-X, and note it if that gets us much closer
to 0.

But on the "cake and eat it too" front I also realize that it sucks to
not be able to mark tests you made leak-free as passing just because
I've got some larger more general leak fixes planned, and even if none
of your code leaks, it might just be "git log" or whatever.

But we can just use LSAN_OPTIONS with a suppressions file for that. This
diff below (assume my SOB yadayada) makes your tests pass if I eject the
11/11 and replace that with this:

diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index a3c72b68f7c..3f6d716f825 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -1,8 +1,23 @@
 #!/bin/sh
 
 test_description='multi-pack-indexes'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+test_expect_success SANITIZE_LEAK 'setup LSAN whitelisting' '
+	suppressions=".git/lsan-suppressions.txt" &&	    
+	cat >"$suppressions" <<-\EOF &&
+	leak:add_rev_cmdline
+	leak:cmd_log_init_finish
+	leak:prep_parse_options
+	leak:prepare_revision_walk
+	leak:strdup
+	EOF
+	LSAN_OPTIONS="$LSAN_OPTIONS:suppressions=\"$PWD/$suppressions\"" &&
+	export LSAN_OPTIONS
+'
+
 GIT_TEST_MULTI_PACK_INDEX=0
 objdir=.git/objects
 
I think that would be a much better approach here, and would really
prefer if we went for some version of that if you really want to test
these right away with the linux-leaks job.

I think the better thing to do here is to just leave it, i.e. we've got
tons of these tests that don't leak themselves, but leak due to "git
log" or whatever already, but anyway, if what you're doing in
t5319-multi-pack-index.sh doesn't cause much more work for me as noted
above I really don't mind.

If you want to pick that approach up and run with it I think it would
probably make sense to factor that suggested test_expect_success out
into a function in test-lib-functions.sh or whatever, and call it as
e.g.:
    
    TEST_PASSES_SANITIZE_LEAK=true
     . ./test-lib.sh
    declare_known_leaks <<-\EOF
    add_rev_cmdline
    [...]
    EOF

Then pipe it through sed 's/^/leak:/' and have it set LSAN_OPTIONS for
you.

Doing it that way would be less boilerplate for each test that wants it,
and is also more likely to work with other non-LSAN leak appoaches,
i.e. as long as something can take a list of lines matching stack traces
we can feed that to that leak checker's idea of a whitelist.