Re: [PATCH] leak tests: add an interface to the LSAN_OPTIONS "suppressions"

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 27 Oct 2021 10:04:17 +0200

On Tue, Oct 26 2021, Jeff King wrote:

It seems that I've gotten what I wanted from this side-thread, yay!
Thanks both, not to pile on, but just to clarify:

> On Tue, Oct 26, 2021 at 04:23:14PM -0400, Taylor Blau wrote:
>
>> So this all feels like a bug in ASan to me. I'm curious if it works on
>> your system, but in the meantime I think the best path forward is to
>> drop the last patch of my original series (the one with the three
>> UNLEAK() calls) and to avoid relying on this patch for the time being.
>
> Bugs aside, I'd much rather see UNLEAK() annotations than external ones,
> for all the reasons we introduced UNLEAK() in the first place:
>
>   - it keeps the annotations near the code. Yes, that creates conflicts
>     when the code is changed (or the leak is actually fixed), but that's
>     a feature. It keeps them from going stale.

I fully agree with you in general for "good" uses of UNLEAK(). E.g. consider:

    struct strbuf buf = xmalloc(...);
    [...]
    UNLEAK(buf);

If I was fixing leaks in that sort of code what I pointed out downthread
in [1] wouldn't apply.

So to clarify, I'm not asking in [1] that UNLEAK() not be used at all
while we have in-flight leak fixes. I.e. I'd run into a textual
conflict, but that would be trivial to resolve, and obvious what the
semantic & textual conflict was.

Rather, it's not marking specific leaks, but UNLEAK() on a container
struct that's problematic.

Depending on how they're used those structs may or may not leak. So
e.g. Taylor's upthread 11/11[2] contained:

    diff --git a/builtin/rev-list.c b/builtin/rev-list.c
    index 36cb909eba..df3811e763 100644
    --- a/builtin/rev-list.c
    +++ b/builtin/rev-list.c
    @@ -549,6 +549,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)

            argc = setup_revisions(argc, argv, &revs, &s_r_opt);

    +       UNLEAK(revs);
    +
            memset(&info, 0, sizeof(info));
            info.revs = &revs;
            if (revs.bisect)

Which if you on master replace that with:

    diff --git a/builtin/rev-list.c b/builtin/rev-list.c
    index 36cb909ebaa..3cce87e0eb7 100644
    --- a/builtin/rev-list.c
    +++ b/builtin/rev-list.c
    @@ -548,6 +548,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
                    revs.do_not_die_on_missing_tree = 1;

            argc = setup_revisions(argc, argv, &revs, &s_r_opt);
    +       BUG("using this?");

            memset(&info, 0, sizeof(info));
            info.revs = &revs;

You'll run into a failure with:

    GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0002-gitfile.sh

I.e. our existing leak tests already take that codepath and don't run
into a leak from using "revs".

So we wouldn't just be marking a known specific leak, but hiding all
leaks & non-leaks in container from the top-level, and thus hide
potential regressions, an addition to attaining the end-goal of marking
some specific test as passing.

Now, that specific example isn't very useful right now, since we don't
have a release_revisions() at all, but e.g. the first batch of fixes
I've got for the revisions.[ch] leak fix most common cases (e.g. the
pending array), but not some obscure ones.

Being able to incrementally mark those leaks as fixed on a test-by-test
basis has the advantage over UNLEAK() that we won't have regressions
while we're not at a 100% leak free (at which point we could remove the
UNLEAK()).

>   - leak-checkers only know where things are allocated, not who is
>     supposed to own them. So you're often annotating the wrong thing;
>     it's not a strdup() call which is buggy and leaking, but the
>     function five layers up the stack which was supposed to take
>     ownership and didn't.

As noted in [3] this case is because the LSAN suppressions list was in
play, so we excluded the meaningful part of the stack trace (which is
shown in that mail).

Hrm, now that I think about it I think that the cases where I had to
resort to valgrind to get around such crappy stacktraces (when that was
all I got, even without suppressions) were probably all because there
was an UNLEAK() in play...

1. https://lore.kernel.org/git/211022.86sfwtl6uj.gmgdl@xxxxxxxxxxxxxxxxxxx/
2. https://lore.kernel.org/git/f1bb8b73ffdb78995dc5653791f9a64adf216e21.1634787555.git.me@xxxxxxxxxxxx/
3. https://lore.kernel.org/git/211027.86zgquyk52.gmgdl@xxxxxxxxxxxxxxxxxxx/