Re: [RFC PATCH 03/15] reftable: don't memset() a NULL from failed malloc()

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Sat, 04 Jun 2022 18:23:13 +0200

On Sat, Jun 04 2022, René Scharfe wrote:

> Am 04.06.22 um 02:54 schrieb Ævar Arnfjörð Bjarmason:
>>
>> To your comment here & some others (e.g. FREE_AND_NULL()): I was really
>> trying to focus on narrowly addressing these -fanalyzer issues without
>> digressing into the larger topics "what is this code *really* doing, and
>> does it make sense?". It was pretty unavoidable in 13/15 though.
>>
>> Which isn't to say that I shouldn't fix some of it, e.g. your
>> s/return/BUG()/ suggestion, but I think it's best to view these patches
>> with an eye towards us already having these issues, and in most cases
>> making -fanalyzer happy is a small cost.
>>
>> And by doing so and getting a "clean build" we'll be able to turn it on
>> in CI, and thus notice when we run into new -fanalyzer issues.
>
> Future analyzer reports are likely of the same quality as the current
> ones.  If the goal is to shush them then we should just not use the
> analyzer.  If reports contain a helpful signal, e.g. pointing to a real
> bug or to overly complicated code, then we better address these issues.
>
> We can think about automating the analyzer once we have a certain number
> of commits with improvements that would not have been made without it.

We might decide not to go with -fanalyzer in CI or whatever, but I
really think that your line of reasoning here is just the wrong way to
evaluate the cost/benefit of -fanalyzer, a new warning or whatever.

There are ~15 commits in this series addressing things -fanalyzer
brought up, and it would be ~20 if the remaining issues I punted on were
addressed.

The question shouldn't be whether those things in particular were worth
the effort, but whether the added safety of getting the new diagnostic
going forward is worth the one-time cost.

Some of these commits are fixing issues going back to 2007-ish, and
$(git log --no-merges --oneline -- '*.[ch]' | wc -l) is ~25k, i.e. ~25k
commits touching C sources. Looking at it like that, 20/25k isn't that
bad of a ratio :)

FWIW I spotted a couple of bugs in my own unsubmitted code from running
all of it through -fanalyzer, and that POV is also worth thinking about,
i.e. it's not just about improving git's current code, or even commits
that might land in git.git in the future.

It's also about providing a development aid, so that when we're writing
patches we spot issues earlier, even if they're ones we might otherwise
have spotted before sending the patch, or in review before it gets
applied.

It's also a much faster way of spotting certain issues, if you take into
account that we've already been spotting some of these with the likes of
SANITIZE=address, valgrind runs, or coverity.

I find the warning output from -fanalyzer to be *really useful*. It's
scarily verbose at first, but it's basically doing most of the work for
you in terms of exhaustively describing how the control flow got to a
given location. With e.g. SANITIZE=address and valgrind (to the extent
that they overlap) you might get a stacktrace or two, but you generally
have to chase all that down yourself.
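
To make that concrete, here's a minimal sketch of the malloc()-then-
memset() pattern from this patch's subject. The helper names are
hypothetical and this is not code from reftable or from this series. I'd
expect GCC's -fanalyzer to flag the first function (as a possibly-NULL
argument to memset()), describing the malloc() call, the assumed
allocation failure, and the memset() that follows. The exact output
depends on the GCC version and libc headers, so treat this as an
illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: the result of malloc() goes straight to memset().
 * If the allocation fails, memset() is handed a NULL pointer.
 */
void *zeroed_alloc_unchecked(size_t n)
{
	void *p = malloc(n);

	memset(p, 0, n); /* p may be NULL here */
	return p;
}

/*
 * Hypothetical helper: same allocation, but the failure case is handled
 * before the buffer is touched.
 */
void *zeroed_alloc_checked(size_t n)
{
	void *p = malloc(n);

	if (!p)
		return NULL;
	memset(p, 0, n);
	return p;
}

/* Usage: the checked variant either fails cleanly or yields zeroed memory. */
int demo(void)
{
	unsigned char *buf = zeroed_alloc_checked(8);

	if (!buf)
		return -1;
	assert(buf[0] == 0 && buf[7] == 0);
	free(buf);
	return 0;
}
```

Compiling something like this with "gcc -fanalyzer -c" is one way to see
the kind of control-flow description I mean, assuming a GCC new enough
to have the analyzer (10 or later).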