Re: [patch 13/15] epoll: check ep_events_available() upon timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 2, 2020 at 10:51 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> I'll go stare at it some more.

That code is fundamentally broken in so many ways.

Look at how  ep_poll() does that ep_events_available() without
actually holding the ep->lock (or the ep->mtx) in half the cases.

End result: it works in 99.9% of all cases, but there's a race with
somebody else doing

        WRITE_ONCE(ep->ovflist, EP_UNACTIVE_PTR);
        /*
         * Quickly re-inject items left on "txlist".
         */
        list_splice(&txlist, &ep->rdllist);

when that code can see an empty rdllist and an inactive ovflist and
thus decide that there are no events available.

I think the "Quickly re-inject" comment may be because some people
knew of that race.

The signal handling is also odd and looks broken.

The "short circuit on fatal signals" means that ep_send_events() isn't
actually done on a SIGKILL, but the code also used an exclusive wait,
so nobody else will be woken up either.

Admittedly you can steal wakeups other ways, by simply not caring
about the end result, so maybe that's all just inherent in epoll
anyway. But it looks strange, and it seems pointless: the right thing
to do would seem to be simply to have a regular check for
signal_pending(), and returning -EINTR if rather than looping.

And that do { } while (0) is entirely pointless. It seems to exist in
order to use "break" instead of the goto that everything else does,
which I guess is nice, except the whole need for that comes from how
oddly the code is written.

Why doesn't this all do something like the attached instead?

NOTE! I did not bother to fix that ep_events_available() race.

                      Linus

Attachment: patch
Description: Binary data


[Index of Archives]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux