Re: [PATCH v3 00/13] epoll: support pollable epoll from userspace

On 5/16/19 2:57 AM, Roman Penyaev wrote:
> Hi all,
> 
> This is v3 which introduces pollable epoll from userspace.
> 
> v3:
>   - Measurements made, represented below.
> 
>   - Fix alignment of the epoll_uitem structure on all 64-bit archs except
>     x86-64. epoll_uitem should always be 16 bytes; a proper BUILD_BUG_ON
>     is added. (Linus)
> 
>   - Check pollflags explicitly for 0 inside the work callback, and do
>     nothing if 0.
> 
> v2:
>   - No reallocations, the max number of items (thus size of the user ring)
>     is specified by the caller.
> 
>   - Interface is simplified: -ENOSPC is returned on an attempt to add a
>     new epoll item if the number of items has reached the max, nothing more.
> 
>   - Allocated pages are accounted using user->locked_vm and limited to
>     the RLIMIT_MEMLOCK value.
> 
>   - EPOLLONESHOT is handled.
> 
> This series introduces pollable epoll from userspace, i.e. the user
> creates an epfd with a new EPOLL_USERPOLL flag, mmaps the epoll
> descriptor, gets the header and ring pointers, and then consumes ready
> events from the ring, avoiding the epoll_wait() call.  When the ring is
> empty, the user has to call epoll_wait() in order to wait for new events.
> epoll_wait() returns -ESTALE if the user ring still has events in it (an
> indication that the user has to consume events from the user ring first;
> I could not invent anything better than returning -ESTALE).
> 
> For user header and user ring allocation I used vmalloc_user().  I found
> that it is much easier to reuse remap_vmalloc_range_partial() instead of
> dealing with the page cache (like aio.c does).  What is also nice is that
> the virtual address is properly aligned on SHMLBA, thus there should not
> be any d-cache aliasing problems on archs with VIVT or VIPT caches.
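
For readers following along, the consume-from-ring flow described in the
quoted cover letter can be sketched roughly as below. This is only a
single-threaded userspace mock, not the interface from the patches: the
names uitem, uring_mock, mock_produce and consume_ready, the head/tail
layout and the ring size are all assumptions made for illustration. The
only details taken from the cover letter are the 16-byte item size and the
idea that the user drains the ring and calls epoll_wait() only when the
ring is empty.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical item layout (an assumption, not the patch's definition).
 * The 16-byte size mirrors the constraint mentioned in the cover letter. */
struct uitem {
	uint32_t ready_events;	/* events that fired */
	uint32_t events;	/* events the user asked for */
	uint64_t data;		/* user cookie */
};
static_assert(sizeof(struct uitem) == 16, "uitem must be 16 bytes");

#define RING_SIZE 8u		/* assumed; power of two */

/* Mock of a header plus ring. In the real series this memory would be
 * mmap()ed from the epoll descriptor and filled by the kernel. */
struct uring_mock {
	uint32_t head;		/* consumer index, advanced by the user */
	uint32_t tail;		/* producer index, advanced by the producer */
	struct uitem items[RING_SIZE];
};

/* Stand-in for the kernel side: publish one ready event.
 * Returns -1 when the ring is full. */
static int mock_produce(struct uring_mock *r, uint32_t ev, uint64_t data)
{
	struct uitem *it;

	if (r->tail - r->head == RING_SIZE)
		return -1;
	it = &r->items[r->tail % RING_SIZE];
	it->ready_events = ev;
	it->events = ev;
	it->data = data;
	r->tail++;
	return 0;
}

/* User side: copy out up to max ready events without any syscall.
 * A return of 0 means the ring is empty, which is the point where the
 * caller would fall back to epoll_wait(). A ring shared with the kernel
 * would also need atomics and memory barriers, omitted in this mock. */
static int consume_ready(struct uring_mock *r, struct uitem *out, int max)
{
	int n = 0;

	while (n < max && r->head != r->tail) {
		out[n++] = r->items[r->head % RING_SIZE];
		r->head++;
	}
	return n;
}
```

A caller would loop on consume_ready() and block in epoll_wait() only once
it returns 0.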

Why aren't we just adding support to io_uring for this instead? Then we
don't need yet another entirely new ring that is just a little
different from what we have.

I haven't looked into the details of your implementation, just curious
if there's anything that makes using io_uring a non-starter for this
purpose?

-- 
Jens Axboe



