The patch titled
     Subject: epoll: check ep_events_available() upon timeout
has been added to the -mm tree.  Its filename is
     epoll-check-ep_events_available-upon-timeout.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/epoll-check-ep_events_available-upon-timeout.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/epoll-check-ep_events_available-upon-timeout.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
Subject: epoll: check ep_events_available() upon timeout

After abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2)
timeout"), we break out of the ep_poll loop upon timeout, without checking
whether there are any new events available.  Prior to that patch-series we
always called ep_events_available() after exiting the loop.

This can cause races and missed wakeups.  For example, consider the
following scenario reported by Guantao Liu:

Suppose we have an eventfd added using EPOLLET to an epollfd.

Thread 1: Sleeps for just below 5 ms and then writes to an eventfd.
Thread 2: Calls epoll_wait with a timeout of 5 ms.  If it sees an
          event of the eventfd, it will write back on that fd.
Thread 3: Calls epoll_wait with a negative timeout.

Prior to abc610e01c66, it is guaranteed that Thread 3 will be woken up by
either Thread 1 or Thread 2.  After abc610e01c66, Thread 3 can be blocked
indefinitely if Thread 2 sees a timeout right before the write to the
eventfd by Thread 1.
Thread 2 will be woken up from schedule_hrtimeout_range and, with
eavail 0, it will not call ep_send_events().

To fix this issue, while holding the lock, try to remove the thread that
timed out from the wait queue and check whether it was woken up or not.

Link: https://lkml.kernel.org/r/20201028180202.952079-1-soheil.kdev@xxxxxxxxx
Fixes: abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2) timeout")
Signed-off-by: Soheil Hassas Yeganeh <soheil@xxxxxxxxxx>
Reported-by: Guantao Liu <guantaol@xxxxxxxxxx>
Tested-by: Guantao Liu <guantaol@xxxxxxxxxx>
Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx>
Acked-by: Willem de Bruijn <willemb@xxxxxxxxxx>
Reviewed-by: Khazhismel Kumykov <khazhy@xxxxxxxxxx>
Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/eventpoll.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- a/fs/eventpoll.c~epoll-check-ep_events_available-upon-timeout
+++ a/fs/eventpoll.c
@@ -1907,7 +1907,21 @@ fetch_events:
 
 		if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS)) {
 			timed_out = 1;
-			break;
+			__set_current_state(TASK_RUNNING);
+			/*
+			 * Acquire the lock and try to remove this thread from
+			 * the wait queue. If this thread is not on the wait
+			 * queue, it has woken up after its timeout ended
+			 * before it could re-acquire the lock. In that case,
+			 * try to harvest some events.
+			 */
+			write_lock_irq(&ep->lock);
+			if (!list_empty(&wait.entry))
+				__remove_wait_queue(&ep->wq, &wait);
+			else
+				eavail = 1;
+			write_unlock_irq(&ep->lock);
+			goto send_events;
 		}
 
 		/* We were woken up, thus go and try to harvest some events */
_

Patches currently in -mm which might be from soheil@xxxxxxxxxx are

epoll-check-ep_events_available-upon-timeout.patch
epoll-add-a-selftest-for-epoll-timeout-race.patch
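
Illustrative note (not part of the patch): the scenario in the changelog
depends on edge-triggered eventfd behaviour, where each write produces one
edge and a consumed edge is not reported again until the next write.  That is
why Thread 3 relies on Thread 2 seeing the event and writing back.  The
single-threaded userspace sketch below only demonstrates that dependency; it
does not reproduce the race and is not the selftest from the companion patch.
The helper names poll_once(), kick() and edge_demo() are hypothetical.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking epoll_wait(), returns the event count. */
static int poll_once(int epfd)
{
	struct epoll_event ev;

	return epoll_wait(epfd, &ev, 1, 0);
}

/* Hypothetical helper: add 1 to the eventfd counter; 0 on success. */
static int kick(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical demo: record what epoll_wait() reports after each step.
 * out[0]: after the first write (stands in for Thread 1's write)
 * out[1]: polling again with no new write (the edge was already consumed)
 * out[2]: after a second write (stands in for Thread 2 writing back)
 * Returns 0 on success, -1 if setup or a write failed.
 */
static int edge_demo(int out[3])
{
	struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
	int efd = eventfd(0, EFD_NONBLOCK);
	int epfd = epoll_create1(0);
	int ret = -1;

	if (efd < 0 || epfd < 0)
		goto out;
	ev.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto out;

	if (kick(efd))
		goto out;
	out[0] = poll_once(epfd);
	out[1] = poll_once(epfd);
	if (kick(efd))
		goto out;
	out[2] = poll_once(epfd);
	ret = 0;
out:
	if (epfd >= 0)
		close(epfd);
	if (efd >= 0)
		close(efd);
	return ret;
}
```

Calling edge_demo() from a test program and inspecting out[] shows that the
second poll reports nothing even though the eventfd counter is still nonzero.
If the waiter that should have consumed the edge and written back never
harvests the event, no further edge is produced for the remaining waiter.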