Re: [PATCH] epoll: avoid calling ep_call_nested() from ep_poll_safewake()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/17/2017 11:37 AM, Davidlohr Bueso wrote:
> On Fri, 13 Oct 2017, Jason Baron wrote:
> 
>> The ep_poll_safewake() function is used to wakeup potentially nested
>> epoll
>> file descriptors. The function uses ep_call_nested() to prevent entering
>> the same wake up queue more than once, and to prevent excessively deep
>> wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary
>> since we are already preventing these conditions during EPOLL_CTL_ADD.
>> This
>> saves extra function calls, and avoids taking a global lock during the
>> ep_call_nested() calls.
> 
> This makes sense.
> 
>>
>> I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC
>> case, since ep_call_nested() keeps track of the nesting level, and
>> this is
>> required by the call to spin_lock_irqsave_nested(). It would be nice to
>> remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as
>> well, however its not clear how to simply pass the nesting level through
>> multiple wake_up() levels without more surgery. In any case, I don't
>> think
>> CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch,
>> also
>> apparently fixes a workload at Google that Salman Qazi reported by
>> completely removing the poll_safewake_ncalls->lock from wakeup paths.
> 
> I'm a bit curious about the workload (which uses lots of EPOLL_CTL_ADDs) as
> I was tackling the nested epoll scaling issue with loop and readywalk lists
> in mind.
>>

I'm not sure the details of the workload - perhaps Salman can elaborate
further about it.

It would seem that the safewake would potentially be the most contended
in general in the nested case, because generally you have a few epoll
fds attached to lots of sources doing wakeups. So those sources are all
going to conflict on the safewake lock. The readywalk is used when
performing a 'nested' poll and in general this is likely going to be
called on a few epoll fds. That said, we should remove it too. I will
post a patch to remove it.

The loop lock is used during EPOLL_CTL_ADD to check for loops and deep
wakeup paths and so I would expect this to be less common, but I
wouldn't doubt there are workloads impacted by it. We can potentially, I
think remove this one too - and the global 'epmutex'. I posted some
ideas a while ago on it:

http://lkml.iu.edu/hypermail//linux/kernel/1501.1/05905.html

We can work through these ideas or others...

Thanks,

-Jason


>> Signed-off-by: Jason Baron <jbaron@xxxxxxxxxx>
>> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Salman Qazi <sqazi@xxxxxxxxxx>
> 
> Acked-by: Davidlohr Bueso <dbueso@xxxxxxx>
> 
>> ---
>> fs/eventpoll.c | 47 ++++++++++++++++++-----------------------------
>> 1 file changed, 18 insertions(+), 29 deletions(-)
> 
> Yay for getting rid of some of the callback hell.
> 
> Thanks,
> Davidlohr



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux