On 5/25/22 6:08 AM, Xiaoguang Wang wrote:
> hello,
>
> I raised this issue last year and have had some discussions with Pavel, but
> didn't come to an agreement and didn't come up with a better solution. You
> can see my initial patch and discussions in the mail below:
> https://lore.kernel.org/all/20211012084811.29714-1-xiaoguang.wang@xxxxxxxxxxxxxxxxx/T/
>
> The biggest issue with the file registration feature is that it needs
> user space apps to maintain free slot info about io_uring's fixed file
> table, which really is a burden. Now I see io_uring starts to return
> a file slot from the kernel by using the IORING_FILE_INDEX_ALLOC flag in
> accept or open operations, but they need apps to use direct accept or
> direct open, which is not convenient. As far as I know, some apps are not
> prepared to use direct accept or open:
> 1) App uses one io_uring instance to accept one connection, but
> later it will route this new connection to another io_uring instance
> to complete read/write, which achieves load balancing. In this case,
> direct accept won't work. We still need a valid fd, then another
> io_uring instance can register it again.

This one very well could work. We already have MSG_RING for sending a
message from one ring to the next, and that could definitely be used to
pass a direct descriptor as well, and drop it (or not) from the source
ring.

> 2) After getting a new connection, if later the app wants to call
> fcntl(2) or setsockopt or similar on it, we will need a true fd, not
> a file slot in io_uring's file table, unless we can make io_uring
> support all existing syscalls which use an fd.

That is definitely a problem, and actually the reason why e.g.
IORING_OP_SOCKET now exists. Seems the best solution there is pretty
simple - wire up fcntl() and setsockopt(). The latter is actually
trivial now that we have file_operations->uring_cmd().

> So we may still need to make the io_uring file registration feature easier
> to use. I'd like io_uring in the kernel to return a prepared file slot. For
> example, for IORING_OP_FILES_UPDATE, we could let the user pass one fd and
> return the free slot found in cqe->res, just like what
> IORING_FILE_INDEX_ALLOC does.
>
> This is my current rough idea, any more thoughts? Thanks.

I'm all for making it easier to use, but avoiding the "normal" file
table is preferable for a lot of reasons. Proof is in the pudding, feel
free to send an actual patch we can discuss.

-- 
Jens Axboe
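
For illustration, the userspace bookkeeping that the quoted mail calls a
burden might look roughly like the sketch below. This is not code from the
thread or from liburing: slot_table, slot_alloc, slot_free and NR_SLOTS are
hypothetical names, and the 64-entry table size is an assumption made only
to keep the bitmap in a single word. It only tracks which fixed-file slots
are free; registering an fd into the returned slot would still be a separate
step.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table size for this sketch: one bit per slot in a u64. */
#define NR_SLOTS 64

/* Userspace view of the fixed file table: bit set = slot in use. */
struct slot_table {
	uint64_t used;
};

static void slot_table_init(struct slot_table *t)
{
	t->used = 0;
}

/* Return the lowest free slot and mark it used, or -1 if the table is full. */
static int slot_alloc(struct slot_table *t)
{
	for (int i = 0; i < NR_SLOTS; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if (!(t->used & bit)) {
			t->used |= bit;
			return i;
		}
	}
	return -1;
}

/* Mark a previously allocated slot as free again. */
static void slot_free(struct slot_table *t, int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	assert(t->used & (UINT64_C(1) << slot));
	t->used &= ~(UINT64_C(1) << slot);
}
```

With kernel-side allocation as proposed in the quoted mail, the app would
read the slot index from cqe->res instead of calling something like
slot_alloc() itself, though it would presumably still have to remember
which slot belongs to which connection.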