Re: Potentially false-positive -Wstringop-overflow= warning with gcc >= 11.1

Richard Sandiford via Gcc-help <gcc-help@xxxxxxxxxxx> · Tue, 01 Feb 2022 05:18:57 +0000

Martin Sebor via Gcc-help <gcc-help@xxxxxxxxxxx> writes:
> On 1/28/22 09:16, Dumitru Ceara via Gcc-help wrote:
>> On 1/28/22 16:27, Segher Boessenkool wrote:
>>> On Fri, Jan 28, 2022 at 04:01:36PM +0100, Dumitru Ceara via Gcc-help wrote:
>>>>      void *l4data = p.l4_ofs != UINT16_MAX ? (char *) p.base_ + p.l4_ofs : NULL;
>>>>      struct hdr2 *h2 = l4data;
>>>>
>>>>      memcpy(h2 + 1, &somedata[0], 6);
>>>
>>> l4data can be 0, and everything false apart from there on.
>>>
>> 
>> In general, yes, l4data can be 0, if p.l4_ofs == UINT16_MAX, which can
>> only happen if pkt_bar() changed p.base_.
>> 
>> But the compiler can't know that for sure and the warning makes it sound
>> like it's a sure thing:
>> 
>> "warning: ‘memcpy’ writing 6 bytes into a region of size 0 overflows the
>> destination"
>
> The reason why it makes it sound like it's a sure thing is because
> a) the warning sees an invalid statement in the IL, and b) because
> GCC warnings work on one IL statement at a time and without regard
> to the conditions under which they might be executed.
>
> An IL statement is not always the same as a source code statement.
> Some statements are also isolated from the source code even when they
> don't appear there, or appear with different arguments.  That's also
> what happens in this case: GCC replaces the memcpy call and
> the subsequent store with two: one pair is valid  and another pair
> of invalid statements.  Here's the problem in the IL the warning
> sees and points out:
>
>    <bb 4> [count: 0]:
>    memcpy (4B, &somedata[0], 6);        <<< -Warray-bounds
>    MEM[(struct hdr2 *)0B].h21 ={v} 0;   <<< causes path isolation
>    __builtin_trap ();                       and trap to be added
>
> Before the warning is issued GCC notices the branch of the code where
> l4data is null is invalid.  It splits it into two branches: valid and
> invalid and adds a trap after the invalid sequence for good measure.
> The warning (which runs much later) then discovers the isolated code
> is invalid and points it out.  The path isolation can be disabled by
> compiling with -fno-isolate-erroneous-paths-dereference.  That also
> avoids the warning.
>
> This is an example where the invalid code isolation seems to work at
> cross purposes with the warning (or at least without coordination
> with it).  It's also one that illustrates an inconsistency in
> the handling of invalid code (undefined behavior) in GCC.  Some
> instances are eliminated (optimized away), others are replaced with
> a trap, others left in place but followed by a trap (the store at
> address zero), and others still are just left in place (the invalid
> memcpy at address 4).
>
> More to your point, it's also an example where warnings issued for
> code that's reachable only under some conditions make it seem like
> they point out unconditional bugs.
>
> All these are known problems and all of them should be tackled
> somehow.  The last one (warning for conditional code) especially
> has been a sore point because it's in everyone's face.  A number
> of ideas have been thrown out over the years, starting with simply
> rephrasing the warning to be less affirmative.  E.g., instead of
> "memcpy writing..." print:
>
>    ‘memcpy’ may write 6 bytes into a region of size 0
>
> The problem with this suggestion is that because most instances of
> these warnings occur in some conditionally executed code, and because
> all GCC warnings are meant to be taken as possible (as opposed to
> definitive) indication of bugs, changing their phrasing wouldn't
> actually solve anything.  All it would do is replace one with
> the other, without helping you understand why or when the bug might
> happen.

I agree that changing the wording doesn't solve the whole problem,
but I think it does solve something.  At the moment, we're effectively
asking each individual user to be aware of the context above, to know what
meaning is being attached to the present tense.  Making the message more
equivocal would at least suggest that the compiler doesn't “know”.

It's not the pass's “fault” that it can't tell how closely the IL
it sees matches the original code.  But it is GCC's “fault” as a whole,
not the user's, and I think that's what matters here.

> What I think is needed here is to mention the conditions
> under which the invalid statement is executed.  With that, we should
> be able to print the same context as what the static analyzer prints:
>
> warning: dereference of NULL ‘0’ [CWE-476] [-Wanalyzer-null-dereference]
>     43 |     h2->h21 = 0;
>        |     ~~~~~~~~^~~
>    ‘foo’: events 1-3
>      |
>      |   39 |  *l4data = p.l4_ofs != UINT16_MAX ? (char *) p.base_ + 
> p.l4_ofs : NULL;
>      |      | 
>        ^
>      |      | 
>        |
>      |      | 
>        (1) following ‘false’ branch...
>
>      |......
>      |   42 | py(h2 + 1, &somedata[0], 6);
>      |      |    ~~~~~~ 
>
>      |      |       |
>      |      |       (2) ...to here
>
>      |   43 | h21 = 0;
>      |      | ~~~~~~~ 
>
>      |      |     |
>      |      |     (3) dereference of NULL ‘<unknown>’

I agree this would be better, but I don't think it should hold back
tweaks to the error messages in the meantime.  (And if the tracing
was an optional feature, we'd still want wording that makes sense
when the feature is turned off.)

I realise this is rehashing an old discussion, sorry.  But it seems
like it's a discussion that gets rehashed precisely because it's
about an issue that users keep hitting.

Thanks,
Richard