Re: Whether to keep using UNLEAK() in built-ins

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:

>>> Specifically, the cited commit turned an existing strbuf_release()
>>> on &err into UNLEAK().  If that and the other strbuf (sb) were so
>>> easily releasable, why didn't we do so back then already?
>>
>> I suspect that the answer to the above question is because these
>> allocations are in the top-level cmd_commit() function, which is
>> never called recursively or repeatedly as a subroutine.  The only
>> significant thing that happens after we return from it is to exit.
>>
>> In such a code path, marking a variable as UNLEAK() is a better
>> thing to do than calling strbuf_release().  Both will work as a way
>> to squelch sanitizers from reporting a leak that does not matter,
>> but calling strbuf_release() means we'd spend extra cycles to return
>> pieces of memory to the pool, even though we know that the pool
>> itself will be cleaned immediately later at exit.
>>
>> We already have UNLEAK to tell sanitizers not to distract us from
>> spotting and plugging real leaks by reporting these apparent leaks
>> that do not matter.  It is of somewhat dubious value to do a "we
>> care too much about pleasing sanitizer and spend extra cycles at
>> runtime while real users are doing real work" change.

> Per https://lore.kernel.org/git/87a6k8daeu.fsf@xxxxxxxxxxxxxxxxxxx/ the
> real goal I have in mind here is to use the built-ins as a stand-in for
> testing whether the underlying APIs are leak-free.
>
> Because of that having to reason about UNLEAK() doing the right thing or
> not is just unneeded distraction. Yes for a "struct strbuf" it won't
> matter, but most of what we UNLEAK() is more complex stuff like "struct
> rev_info". We won't really make headway making revision.c not leak
> memory without using "git log" et al as the test subjects for whether
> that API leaks.

I have to say that you have a wrong goal and wrong priority.  The
number of UNLEAK in cmd_foo() functions is not even a poor
approximation of our progress.

Imagine that a subsystem that are repeatedly set-up and used during
a single program invocation was leaky.  Let's say a program calls
diff_setup() to prepare the diff_options structure, feeds two things
to compare, and calls diff_flush() to show the comparison result,
but let's assume this sequence leaks some resources.

Now cmd_diff() may be such a program that does a setup, feeds two
trees, calls diff_flush() and exits.  If we didn't do anything to
it, diff_options may "leak".  Marking it with UNLEAK may be a good
measure, if we want to keep the leak checker from reporting a leak
that does not matter in practice so that we can concentrate on
plugging real leaks that matter.

But consider cmd_log(), running something like "git log -p".  It
iterates over commits, does the <setup, compare two trees, flush>
repeatedly for each commit it encounters during the walk on the same
diff_options structure.  Now, the leak in the code path does matter.
If diff_flush() is left leaky, it will show up in the leak checker's
output, and that is reporting real leaks that matter.  The thing is,
the <setup, compare, flush> sequence whose leak matters is not in
cmd_log(); it is much deeper in the revision walking machinery.  We
do not want to paper over such leaks with UNLEAK.

See the difference?  The number of UNLEAK OUTside built-ins does
matter.  We do not want to have UNLEAK there to hide leaks in
possible callees that may be called arbitrary number of times in
arbitrary order.  Compared to that, UNLEAK in cmd_foo() to mark an
on-stack variable that was used only once without even a recursion
is purely to squelch the leak checker from reporting leaks that does
not matter.  For simple things like strbuf on stack of the top-level
cmd_foo() functions, we could even leave strbuf_release() not called
and leave the resource reclamation to _exit(2).  That would cause
the leak checkers to report them and distract us, and UNLEAK is a
better way to squelch the distraction without wasting extra cycles
at runtime to actually free them.

So, don't look at "built-ins as a stand-in".  It is a wrong
direction to go in to let the "leak checker" tail wag the dog.

Thanks.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux