The patch titled
     Subject: fs/epoll: deal with wait_queue only once
has been added to the -mm tree.  Its filename is
     fs-epoll-deal-with-wait_queue-only-once.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/fs-epoll-deal-with-wait_queue-only-once.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/fs-epoll-deal-with-wait_queue-only-once.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Davidlohr Bueso <dave@xxxxxxxxxxxx>
Subject: fs/epoll: deal with wait_queue only once

There is no reason to rearm the waitqueue upon every fetch_events retry
(for when events are found yet send_events() fails).  If nothing else,
this saves four lock operations per retry, and furthermore reduces the
scope of the lock even further.

Link: http://lkml.kernel.org/r/20181114182532.27981-2-dave@xxxxxxxxxxxx
Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/eventpoll.c |   24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

--- a/fs/eventpoll.c~fs-epoll-deal-with-wait_queue-only-once
+++ a/fs/eventpoll.c
@@ -1749,6 +1749,7 @@ static int ep_poll(struct eventpoll *ep,
 {
         int res = 0, eavail, timed_out = 0;
         u64 slack = 0;
+        bool waiter = false;
         wait_queue_entry_t wait;
         ktime_t expires, *to = NULL;
 
@@ -1786,6 +1787,15 @@ fetch_events:
         if (eavail)
                 goto send_events;
 
+        if (!waiter) {
+                waiter = true;
+                init_waitqueue_entry(&wait, current);
+
+                spin_lock_irq(&ep->wq.lock);
+                __add_wait_queue_exclusive(&ep->wq, &wait);
+                spin_unlock_irq(&ep->wq.lock);
+        }
+
         /*
          * Busy poll timed out. Drop NAPI ID for now, we can add
          * it back in when we have moved a socket with a valid NAPI
@@ -1798,10 +1808,6 @@ fetch_events:
          * We need to sleep here, and we will be wake up by
          * ep_poll_callback() when events will become available.
          */
-        init_waitqueue_entry(&wait, current);
-        spin_lock_irq(&ep->wq.lock);
-        __add_wait_queue_exclusive(&ep->wq, &wait);
-        spin_unlock_irq(&ep->wq.lock);
 
         for (;;) {
                 /*
@@ -1837,10 +1843,6 @@ fetch_events:
 
         __set_current_state(TASK_RUNNING);
 
-        spin_lock_irq(&ep->wq.lock);
-        __remove_wait_queue(&ep->wq, &wait);
-        spin_unlock_irq(&ep->wq.lock);
-
 send_events:
         /*
          * Try to transfer events to user space. In case we get 0 events and
@@ -1851,6 +1853,12 @@ send_events:
             !(res = ep_send_events(ep, events, maxevents)) &&
             !timed_out)
                 goto fetch_events;
 
+        if (waiter) {
+                spin_lock_irq(&ep->wq.lock);
+                __remove_wait_queue(&ep->wq, &wait);
+                spin_unlock_irq(&ep->wq.lock);
+        }
+
         return res;
 }
_

Patches currently in -mm which might be from dave@xxxxxxxxxxxx are

fs-epoll-remove-max_nests-argument-from-ep_call_nested.patch
fs-epoll-simplify-ep_send_events_proc-ready-list-loop.patch
fs-epoll-drop-ovflist-branch-prediction.patch
fs-epoll-robustify-ep-mtx-held-checks.patch
fs-epoll-reduce-the-scope-of-wq-lock-in-epoll_wait.patch
fs-epoll-reduce-the-scope-of-wq-lock-in-epoll_wait-fix.patch
fs-epoll-avoid-barrier-after-an-epoll_wait2-timeout.patch
fs-epoll-avoid-barrier-after-an-epoll_wait2-timeout-fix.patch
fs-epoll-rename-check_events-label-to-send_events.patch
fs-epoll-deal-with-wait_queue-only-once.patch
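
For readers who want to see the idea outside of the kernel, below is a
minimal userspace sketch of the same pattern, using a pthread mutex and a
hand-rolled waiter list in place of ep->wq.  All of the names in it
(wq_add(), wq_remove(), poll_events(), nr_ready) are made up for
illustration and are not kernel or libc API; send_events() deliberately
fails once so that the fetch_events-style retry path is exercised.  The
point it demonstrates is the one in the changelog: the waiter is queued
once before the retry loop and removed once on the way out, instead of
being re-armed on every retry.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* A hand-rolled "waitqueue": a singly linked list of waiters under a lock. */
struct waiter {
        struct waiter *next;
        pthread_cond_t cond;
};

static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
static struct waiter *wq_head;
static int nr_ready;                    /* events available for delivery */

static void wq_add(struct waiter *w)    /* __add_wait_queue() stand-in */
{
        w->next = wq_head;
        wq_head = w;
}

static void wq_remove(struct waiter *w) /* __remove_wait_queue() stand-in */
{
        struct waiter **p = &wq_head;

        while (*p && *p != w)
                p = &(*p)->next;
        if (*p)
                *p = w->next;
}

/* Simulate ep_send_events(): fail once so the retry path runs. */
static int send_events(void)
{
        static int failures = 1;

        return failures-- > 0 ? 0 : 1;
}

static int poll_events(void)
{
        struct waiter self = { .cond = PTHREAD_COND_INITIALIZER };
        bool waiter = false;            /* mirrors the new flag in ep_poll() */
        int res;

fetch_events:
        pthread_mutex_lock(&wq_lock);
        if (!waiter) {
                waiter = true;          /* arm the waiter once, not per retry */
                wq_add(&self);
        }
        while (nr_ready == 0)
                pthread_cond_wait(&self.cond, &wq_lock);
        pthread_mutex_unlock(&wq_lock);

        res = send_events();
        if (!res)
                goto fetch_events;      /* retry: the waiter is still queued */

        pthread_mutex_lock(&wq_lock);   /* a single teardown on the way out */
        wq_remove(&self);
        pthread_mutex_unlock(&wq_lock);

        return res;
}

static void *producer(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&wq_lock);
        nr_ready = 1;
        if (wq_head)                    /* ep_poll_callback() stand-in */
                pthread_cond_signal(&wq_head->cond);
        pthread_mutex_unlock(&wq_lock);
        return NULL;
}

int main(void)
{
        pthread_t t;

        pthread_create(&t, NULL, producer, NULL);
        printf("got %d event(s)\n", poll_events());
        pthread_join(t, NULL);
        return 0;
}

Build with cc -pthread.  The producer thread plays the role of
ep_poll_callback(), waking the queued waiter once an event is available.
The "four lock operations per retry" in the changelog are the two
lock/unlock pairs around the add and the remove, which the old code paid
on every pass through fetch_events and which both versions above do
exactly once.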