Hi,

v2 of this RFC, now a bit more polished and split up. For a full
explanation of the goal of the series, see patch #6 and #10.

tldr is that our current rsrc node handling can block freeing of
resources for an indeterminate amount of time, which is very unfortunate
for potentially long lived requests. For example, networked workloads
using fixed files, where a previously long lived socket has the full
resource tables of the entire ring pinned. That can lead to files being
held open for a very long time, where they should be freed+closed
instead.

This series handles the resource nodes separately, so a request pins
just the resources it needs, and only for the duration of that request.
In doing so, it also unifies how these resources are tracked. As it
stands, the current kernel duplicates state across user_bufs and
buf_data, and ditto for the file_table and file_data. Not only is some
of it duplicated (like the node arrays), it also needs to alloc and copy
the tags that are potentially associated with the resource. With the
unification, state is only in one spot for each type of resource, and
tags are handled at registration time rather than needing to be retained
for the duration of the resource.

As well as cleaning up the structures, it also shrinks io_ring_ctx by
64b (should be more, it adds holes too in spots), and the actual
resource node goes from needing 48b and 16b of put info, to 32b.

Tests out well here, both with the liburing test suite and with
application testing. Most notably, the infamous test case that held all
10k sockets open during new opens and updates now only has the few open
you'd expect.

And it removes a net of about 280 lines of core io_uring code. In my
opinion, it's also easier to follow.

Can also be found here:

https://git.kernel.dk/cgit/linux/log/?h=io_uring-rsrc

Changes since v1:
- Rebase on -rc5 + pending io_uring work. Each step now works, as it
  should, and passes testing.
- Add and use node lookup helper consistently
- Add a few patches killing mostly useless helpers
- Split out patches from the main bigger patches
- Fix an assumption in test/rsrc_tags.c that prevented it from working
  with the per-node refs
- Add NOP patch that enables testing of both registered files and
  buffers
- Remove 'index' from struct io_rsrc_node, it was unused now. That
  shrinks the node size to 32b, fitting two in a cacheline.

 include/linux/io_uring_types.h |  25 +-
 include/uapi/linux/io_uring.h  |   3 +
 io_uring/cancel.c              |   8 +-
 io_uring/fdinfo.c              |  14 +-
 io_uring/filetable.c           |  79 ++---
 io_uring/filetable.h           |  37 +--
 io_uring/io_uring.c            |  51 +--
 io_uring/msg_ring.c            |  33 +-
 io_uring/net.c                 |  16 +-
 io_uring/nop.c                 |  47 ++-
 io_uring/notif.c               |   3 +-
 io_uring/opdef.c               |   2 +
 io_uring/register.c            |   3 +-
 io_uring/rsrc.c                | 587 +++++++++++---------------------
 io_uring/rsrc.h                |  96 +++---
 io_uring/rw.c                  |  15 +-
 io_uring/splice.c              |  42 ++-
 io_uring/splice.h              |   1 +
 io_uring/uring_cmd.c           |  20 +-
 19 files changed, 437 insertions(+), 645 deletions(-)

-- 
Jens Axboe
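
For readers who want a concrete picture of the per-node pinning described in the cover letter, here is a minimal userspace sketch. It is not the kernel code from this series: the names rsrc_node, rsrc_table, table_get(), table_remove() and node_put() are hypothetical, and plain integer refcounts stand in for whatever the real implementation uses. It only models the idea that the table slot holds one reference and each in-flight request holds one more, on just the node it uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model, NOT the kernel implementation. */

#define TABLE_SIZE 4

/* Counts how many nodes have been released so far (for illustration). */
static int nodes_released;

struct rsrc_node {
	int refs;	/* 1 for the table slot, +1 per in-flight request */
	int id;		/* stand-in for the file or buffer being tracked */
};

struct rsrc_table {
	struct rsrc_node *nodes[TABLE_SIZE];
};

/* Allocate a node that starts with the table's own reference. */
static struct rsrc_node *node_alloc(int id)
{
	struct rsrc_node *node = malloc(sizeof(*node));

	if (!node)
		return NULL;
	node->refs = 1;
	node->id = id;
	return node;
}

/* Drop one reference; the resource is released when the last one goes. */
static void node_put(struct rsrc_node *node)
{
	assert(node->refs > 0);
	if (--node->refs == 0) {
		nodes_released++;
		free(node);
	}
}

/* Request start: pin only the node at 'index', not the whole table. */
static struct rsrc_node *table_get(struct rsrc_table *table, int index)
{
	if (index < 0 || index >= TABLE_SIZE || !table->nodes[index])
		return NULL;
	table->nodes[index]->refs++;
	return table->nodes[index];
}

/* Unregister or update: clear the slot and drop the table's reference. */
static void table_remove(struct rsrc_table *table, int index)
{
	struct rsrc_node *node;

	if (index < 0 || index >= TABLE_SIZE)
		return;
	node = table->nodes[index];
	if (!node)
		return;
	table->nodes[index] = NULL;
	node_put(node);
}
```

In this model, if a long lived request pins slot 0 and every slot is then removed from the table, the three unpinned nodes are released right away, and only node 0 stays alive until the request calls node_put() on it. That mirrors the behavior change described for the 10k socket test case, under the assumptions stated above.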