On 5/30/24 10:32 AM, Bernd Schubert wrote:
>
>
> On 5/30/24 18:21, Jens Axboe wrote:
>> On 5/30/24 10:02 AM, Bernd Schubert wrote:
>>>
>>>
>>> On 5/30/24 17:36, Kent Overstreet wrote:
>>>> On Wed, May 29, 2024 at 08:00:35PM +0200, Bernd Schubert wrote:
>>>>> From: Bernd Schubert <bschubert@xxxxxxx>
>>>>>
>>>>> This adds support for uring communication between kernel and
>>>>> userspace daemon using the opcode IORING_OP_URING_CMD. The basic
>>>>> approach was taken from ublk. The patches are in RFC state;
>>>>> some major changes are still to be expected.
>>>>>
>>>>> Motivation for these patches is to increase fuse performance.
>>>>> In fuse-over-io-uring, requests avoid core switching (application
>>>>> on core X, processing of fuse server on random core Y) and use
>>>>> shared memory between kernel and userspace to transfer data.
>>>>> Similar approaches have been taken by ZUFS and FUSE2, though
>>>>> not over io-uring, but through ioctl IOs.
>>>>
>>>> What specifically is it about io-uring that's helpful here? Besides the
>>>> ringbuffer?
>>>>
>>>> So the original mess was that because we didn't have a generic
>>>> ringbuffer, we had aio, tracing, and god knows what else all
>>>> implementing their own special-purpose ringbuffers (all with weird
>>>> quirks of debatable or no usefulness).
>>>>
>>>> It seems to me that what fuse (and a lot of other things) want is just a
>>>> clean, simple, easy-to-use generic ringbuffer for sending what-have-you
>>>> back and forth between the kernel and userspace - in this case RPCs from
>>>> the kernel to userspace.
>>>>
>>>> But instead, the solution seems to be to just toss everything into a new
>>>> giant subsystem?
>>>
>>>
>>> Hmm, initially I had thought about writing my own ring buffer, but then
>>> io-uring got IORING_OP_URING_CMD, which seems to have exactly what we
>>> need?
>>> From an interface point of view, io-uring seems easy to use here,
>>> has everything we need, and kind of the same thing is used for ublk -
>>> what speaks against io-uring? And what other suggestion do you have?
>>>
>>> I guess the same concern would also apply to ublk_drv.
>>>
>>> Well, decoupling from io-uring might help to get zero-copy, as there
>>> doesn't seem to be an agreement on Ming's approaches (sorry, I'm only
>>> silently following for now).
>>
>> If you have an interest in the zero copy, do chime in, it would
>> certainly help get some closure on that feature. I don't think anyone
>> disagrees it's a useful and needed feature, but there are different
>> viewpoints on how it's best solved.
>
> We had a bit of discussion with Ming about that last year; besides that
> I got busy with other parts, it became a bit less of a personal
> interest for me as our project really needs to access the buffer
> (additional checksums, sending it out over a network library
> (libfabric), possibly even preprocessing of some data) - I think it
> makes sense if I work on the other fuse parts first and only come back
> to zero copy a bit later.

Ah I see - yes, if you're going to be touching the data anyway, zero
copy is less of a concern. Some memory bandwidth can still be saved if
you're not touching all of it, of course. But if you are, you're
probably better off copying it in the first place.

>>> From our side, a customer has pointed out security concerns for io-uring.
>>
>> That's just bs and fud these days.
>
> I wasn't in contact with that customer personally, I had just seen
> their email. It would probably help if RHEL would eventually gain
> io-uring support - almost all HPC systems are using it or a clone. I
> was always hoping that RHEL would get it before I'm done with
> fuse-over-io-uring; now I'm not so sure anymore.

Not sure what the RHEL status is. I know backports are done on the
io_uring side, but not sure what base they are currently on.
I strongly suspect that would be a gating factor for getting it
enabled. If it's too out of date, then performance isn't going to be as
good as current mainline anyway.

-- 
Jens Axboe