When epfd is polled from userspace and an item is being removed:

1. Mark the user item as freed.  If userspace has not yet consumed
   the ready event, route all events to kernel lists.

2. If shrink is required, route all events to kernel lists.

3. On unregistration of epoll entries, do not forget to flush the
   item worker, which may have just been submitted from
   ep_poll_callback().

Signed-off-by: Roman Penyaev <rpenyaev@xxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Davidlohr Bueso <dbueso@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrea Parri <andrea.parri@xxxxxxxxxxxxxxxxxxxx>
Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
 fs/eventpoll.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 2af849e6c7a5..7732a8029a1c 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -780,6 +780,14 @@ static void ep_unregister_pollwait(struct eventpoll *ep, struct epitem *epi)
 		ep_remove_wait_queue(pwq);
 		kmem_cache_free(pwq_cache, pwq);
 	}
+	if (ep_polled_by_user(ep)) {
+		/*
+		 * Events polled by user require offloading to a work,
+		 * thus we have to be sure everything which was queued
+		 * has run to a completion.
+		 */
+		flush_work(&epi->work);
+	}
 }
 
 /* call only when ep->mtx is held */
@@ -1168,6 +1176,7 @@ static bool ep_add_event_to_uring(struct epitem *epi, __poll_t pollflags)
 static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 {
 	struct file *file = epi->ffd.file;
+	bool events_to_klists = false;
 
 	lockdep_assert_irqs_enabled();
 
@@ -1183,9 +1192,14 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 
 	rb_erase_cached(&epi->rbn, &ep->rbr);
 
+	if (ep_polled_by_user(ep))
+		events_to_klists = ep_free_user_item(epi);
+
 	write_lock_irq(&ep->lock);
 	if (ep_is_linked(epi))
 		list_del_init(&epi->rdllink);
+	if (events_to_klists)
+		ep_route_events_to_klists(ep);
 	write_unlock_irq(&ep->lock);
 
 	wakeup_source_unregister(ep_wakeup_source(epi));
--
2.19.1