On Fri, 13 Oct 2017, Jason Baron wrote:
The ep_poll_safewake() function is used to wakeup potentially nested epoll
file descriptors. The function uses ep_call_nested() to prevent entering
the same wake up queue more than once, and to prevent excessively deep
wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary
since we are already preventing these conditions during EPOLL_CTL_ADD. This
saves extra function calls, and avoids taking a global lock during the
ep_call_nested() calls.
This makes sense.
I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC
case, since ep_call_nested() keeps track of the nesting level, and this is
required by the call to spin_lock_irqsave_nested(). It would be nice to
remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as
well, however its not clear how to simply pass the nesting level through
multiple wake_up() levels without more surgery. In any case, I don't think
CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch, also
apparently fixes a workload at Google that Salman Qazi reported by
completely removing the poll_safewake_ncalls->lock from wakeup paths.
I'm a bit curious about the workload (which uses lots of EPOLL_CTL_ADDs) as
I was tackling the nested epoll scaling issue with loop and readywalk lists
in mind.
Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Salman Qazi <sqazi@xxxxxxxxxx>
Acked-by: Davidlohr Bueso <dbueso@xxxxxxx>
---
fs/eventpoll.c | 47 ++++++++++++++++++-----------------------------
1 file changed, 18 insertions(+), 29 deletions(-)
Yay for getting rid of some of the callback hell.
Thanks,
Davidlohr