The introduction seems to be missing on the fsdevel list

On 3/21/23 02:10, Bernd Schubert wrote:
> This adds support for uring communication between kernel and
> userspace daemon using the opcode IORING_OP_URING_CMD. The basic
> approach was taken from ublk. The patches are in RFC state -
> I'm not sure about all decisions and some questions are marked
> with XXX.
>
> Userspace side has to send IOCTL(s) to configure ring queue(s)
> and it has the choice to configure exactly one ring or one
> ring per core. If there are use cases we can also consider
> allowing a different number of rings - the ioctl configuration
> option is rather generic (number of queues).
>
> Right now a queue lock is taken for any ring entry state change,
> mostly to correctly handle unmount/daemon-stop. In fact,
> correctly stopping the ring took most of the development
> time - new corner cases kept coming up.
> I had run dozens of xfstest cycles,
> versions I had once seen a warning about the ring start_stop
> mutex being in the wrong state - probably another stop issue,
> but I have not been able to track it down yet.
> Regarding the queue lock - I still need to do profiling, but
> my assumption is that it should not matter for the
> one-ring-per-core configuration. For the single ring config
> option lock contention might come up, but I see this
> configuration mostly for development only.
> Adding more complexity and protecting ring entries with
> their own locks can be done later.
>
> Current code also keeps the fuse request allocation; initially
> I only had that for background requests when the ring queue
> didn't have free entries anymore. The allocation is done
> to reduce initial complexity, especially also for ring stop.
> The allocation-free mode can be added back later.
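
To make the locking scheme quoted above concrete (one lock per ring queue, taken for every ring entry state change and for stop), here is a rough userspace model. It is only a sketch under my own assumptions: the struct, the state names, QUEUE_DEPTH and the pthread mutex are all hypothetical stand-ins and are not taken from the patches, which use kernel locking primitives.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical entry states; illustrative names, not from the patches. */
enum ent_state { ENT_FREE, ENT_IN_USE, ENT_STOPPED };

#define QUEUE_DEPTH 4	/* assumed number of ring entries per queue */

struct ring_queue {
	pthread_mutex_t lock;		/* one lock per queue */
	int stopped;			/* set at unmount/daemon stop */
	enum ent_state ent[QUEUE_DEPTH];
};

static void queue_init(struct ring_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->stopped = 0;
	for (int i = 0; i < QUEUE_DEPTH; i++)
		q->ent[i] = ENT_FREE;
}

/* Claim a free entry; returns its index, or -1 if none is free
 * or the queue has been stopped. */
static int queue_get_entry(struct ring_queue *q)
{
	int idx = -1;

	pthread_mutex_lock(&q->lock);
	if (!q->stopped) {
		for (int i = 0; i < QUEUE_DEPTH; i++) {
			if (q->ent[i] == ENT_FREE) {
				q->ent[i] = ENT_IN_USE;
				idx = i;
				break;
			}
		}
	}
	pthread_mutex_unlock(&q->lock);
	return idx;
}

/* Return an entry to the free state, unless stop already claimed it. */
static void queue_put_entry(struct ring_queue *q, int idx)
{
	assert(idx >= 0 && idx < QUEUE_DEPTH);
	pthread_mutex_lock(&q->lock);
	if (q->ent[idx] == ENT_IN_USE)
		q->ent[idx] = ENT_FREE;
	pthread_mutex_unlock(&q->lock);
}

/* Stop takes the same lock, so it cannot race with a state change. */
static void queue_stop(struct ring_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->stopped = 1;
	for (int i = 0; i < QUEUE_DEPTH; i++)
		q->ent[i] = ENT_STOPPED;
	pthread_mutex_unlock(&q->lock);
}
```

With one queue per core, the lock is in practice only contended by the stop path, which matches the assumption that it should not matter for that configuration. With a single ring, every submitter goes through the same mutex.
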
>
> Right now the ring queue of the submitting core is always
> used; especially for page cached background requests
> we might consider later to also enqueue on other core queues
> (when these are not busy, of course).
>
> Splice/zero-copy is not supported yet, all requests go
> through the shared memory queue entry buffer. I am also
> following the splice and ublk/zc copy discussions, and I will
> look into these options in the next days/weeks.
> To have that buffer allocated on the right NUMA node,
> a vmalloc is done per ring queue, on the NUMA node the
> userspace daemon side asks for.
> My assumption is that the mmap offset parameter will be
> part of a debate and I'm curious what others think about
> that approach.
>
> Benchmarking and tuning is on my agenda for the next
> days. For now I only have xfstest results - most longer
> running tests were running at about 2x, but somehow when
> I cleaned up the patches for submission I lost that.
> My development VM/kernel has all sanitizers enabled -
> hard to profile what happened. Performance
> results with profiling will be submitted in a few days.
>
> The patches include a design document, which has a few more
> details.
>
> The corresponding libfuse patches are on my uring branch,
> but need cleanup for submission - will happen during the next
> days.
> https://github.com/bsbernd/libfuse/tree/uring
>
> If it should make review easier, patches posted here are on
> this branch
> https://github.com/bsbernd/linux/tree/fuse-uring-for-6.2
>
>
> Bernd Schubert (13):
>   fuse: Add uring data structures and documentation
>   fuse: rename to fuse_dev_end_requests and make non-static
>   fuse: Move fuse_get_dev to header file
>   Add a vmalloc_node_user function
>   fuse: Add a uring config ioctl and ring destruction
>   fuse: Add an interval ring stop worker/monitor
>   fuse: Add uring mmap method
>   fuse: Move request bits
>   fuse: Add wait stop ioctl support to the ring
>   fuse: Handle SQEs - register commands
>   fuse: Add support to copy from/to the ring buffer
>   fuse: Add uring sqe commit and fetch support
>   fuse: Allow to queue to the ring
>
>  Documentation/filesystems/fuse-uring.rst |  179 +++
>  fs/fuse/Makefile                         |    2 +-
>  fs/fuse/dev.c                            |  193 +++-
>  fs/fuse/dev_uring.c                      | 1292 ++++++++++++++++++++++
>  fs/fuse/dev_uring_i.h                    |   23 +
>  fs/fuse/fuse_dev_i.h                     |   62 ++
>  fs/fuse/fuse_i.h                         |  178 +++
>  fs/fuse/inode.c                          |   10 +
>  include/linux/vmalloc.h                  |    1 +
>  include/uapi/linux/fuse.h                |  131 +++
>  mm/nommu.c                               |    6 +
>  mm/vmalloc.c                             |   41 +-
>  12 files changed, 2064 insertions(+), 54 deletions(-)
>  create mode 100644 Documentation/filesystems/fuse-uring.rst
>  create mode 100644 fs/fuse/dev_uring.c
>  create mode 100644 fs/fuse/dev_uring_i.h
>  create mode 100644 fs/fuse/fuse_dev_i.h
>
> Signed-off-by: Bernd Schubert <bschubert@xxxxxxx>
> cc: Miklos Szeredi <miklos@xxxxxxxxxx>
> cc: linux-fsdevel@xxxxxxxxxxxxxxx
> cc: Amir Goldstein <amir73il@xxxxxxxxx>
> cc: fuse-devel@xxxxxxxxxxxxxxxxxxxxx
>
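
For the "ring queue of the submitting core" rule and the mmap offset question raised in the cover letter, a small sketch may help the discussion. Everything here is an assumption for illustration: the function names are hypothetical, the modulo fallback is my own defensive choice, and encoding the queue id as "offset = qid * page size" is only one possible scheme, not necessarily what the posted patches do.

```c
#include <assert.h>

/*
 * Map the submitting CPU to a ring queue index.
 * nr_queues is 1 for the single-ring configuration, or the number of
 * cores for one ring per core. The modulo is an assumed fallback for
 * CPU ids at or beyond nr_queues.
 */
static unsigned int pick_queue(unsigned int cpu, unsigned int nr_queues)
{
	assert(nr_queues > 0);
	return cpu % nr_queues;
}

/*
 * Hypothetical encoding of the queue id in the mmap offset, so that a
 * per-queue buffer can be selected (and allocated on the matching NUMA
 * node) from the offset alone. mmap offsets must be page aligned,
 * hence the multiplication by the page size.
 */
static unsigned long long qid_to_mmap_offset(unsigned int qid,
					     unsigned long page_size)
{
	return (unsigned long long)qid * page_size;
}

/* Inverse of qid_to_mmap_offset(); rejects unaligned offsets. */
static unsigned int mmap_offset_to_qid(unsigned long long offset,
				       unsigned long page_size)
{
	assert(page_size > 0 && offset % page_size == 0);
	return (unsigned int)(offset / page_size);
}
```

In userspace the cpu argument could come from sched_getcpu(); in the kernel the equivalent would be something like smp_processor_id(). Overloading the offset this way is compact, but it also means the offset no longer describes a position inside one mapping, which is presumably why it is expected to be debated.
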