Re: [PATCH v3] http API: fix dangling pointer issue noted by GCC 12.0

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Fri, 15 Apr 2022 15:30:42 +0200

On Thu, Apr 14 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> Having spelunked in the GCC docs, source, commits involved & in their
>> bugtracker I don't think they'd consider this broken. It's working as
>> designed.
>>
>> Now, of course as with any new compiler warning you might argue that
>> it's too overzealous, but in this case it's included it a -Wall in GCC
>> 12.0.
>
> Whatever.  I do not care if this is a broken or wai from GCC's point
> of view.
>
> I care more about the correct operation of the production code of
> ours, than a workaround to squelch a warning, overzealous or not, by
> a compiler, and if it is not clear that the workaround is obviously
> free of negative side effect, we do not want to deliberately introduce
> potential breakage in our code base just for that.
>
> And I still do not see how it is safe to unconditionally clearing
> the pointer in the slot instance to NULL.  It was asked late January
> in https://lore.kernel.org/git/xmqqv8y52g3a.fsf@gitster.g/

The v3 re-roll at
https://lore.kernel.org/git/patch-v3-1.1-69190804c67-20220325T143322Z-avarab@xxxxxxxxx/
(see range diff) was intended to address both your question there &
Taylor's at https://lore.kernel.org/git/YhprAb1f1WYIktCV@nand.local/.

> In other words, what we should have been spelunking is *not* in the
> GCC docs and code, but the http codepath in our code that depends on
> the slot not being cleared when we didn't set up the pointer in the
> current recursion of that function.  With a clear description on how
> this change is safe, with a clear understanding on why the pointer
> member "finished" was added in the first place to avoid the same
> mistake as an earlier round of this patch [*1*], it would become an
> acceptable patch, with or without GCC warning.
>
> Namely, the finding in this part of a review comment [*2*]
>
>     The only way the separation could make a difference is while
>     step_active_slots(), the current slot is completed, our local
>     "finished" gets marked as such thanks to the pointer-ness of the
>     finished member, and then another pending request is started
>     reusing the same slot object (properly initialized for that new
>     request).  In such a case, the while loop we want to see exit
>     will see that slot->in_use member is _still_ true, even though
>     it is true only because it is now about a separate and unrelated
>     request that is still waiting for completion, and the original
>     request the caller is waiting for has already finished.
>
> that was made to explain why the pointer member is there, and a
> possible case that the code before the introduction of the pointer
> would misbehave and today's code would behave better, worries me the
> most, as unconditionally assigning NULL there like this patch does
> without any guard would mean that we are breaking the code exactly
> in such a case, I would think.

I think that accurately describes a logic error in the v1 of this patch,
i.e. we can't remove the "finished" member, but per the v3 explanation I
believe (re-)setting it to NULL is safe.

> In short, I do not care who takes the credit, I care more about the
> correctness of the code than a warning by a version of a compiler, I
> do not care at all if the compiler writers considers the warning a
> bug, and I worry that the change proposed, while it may certainly
> squelch the bug, may break the code that has been working happily,
> and I haven't seen a clear explanation why it is not the case.
>
> As long as the same slot is never passed to run_active_slot()
> recursively, clearing the member unconditionally when the control
> leaves the function should not break the code.  Nobody seems to have
> explained how it is the case.

I tried to explain that in the v3, but that was part of what you
edited/amended in your applied version of it.

I don't know how to answer your concerns/questions other than as I've
already done there.