Re: [LSF/MM/BPF TOPIC] block drivers in user space

On 3/14/22 20:21, Bart Van Assche wrote:
> On 3/13/22 14:15, Sagi Grimberg wrote:
>>>> We don't want to re-use tcmu's interface.
>>>
>>> Bodo has been looking into a new interface to avoid the issues tcmu has
>>> and to improve performance. If it's allowed to add a tcmu-like backend
>>> to nvmet then that would be great, because lio was not really made with
>>> mq and perf in mind, so it already starts with issues. I just started
>>> doing the basics like removing locks from the main lio IO path, but it
>>> seems like there is just so much work.
>>
>> Good to know...
>>
>> So I hear there is a desire to do this. So I think we should list the
>> use-cases for this first, because that would lead to different design
>> choices. For example, one use-case is just to send read/write/flush to
>> userspace; another may want to pass NVMe commands through to userspace;
>> and there may be others...

> (resending my reply without truncating the Cc-list)
>
> Hi Sagi,
>
> Haven't these use cases already been mentioned in the email at the start
> of this thread? The use cases I am aware of are implementing
> cloud-specific block storage functionality and also block storage in
> user space for Android. Having to parse NVMe commands and PRP or SGL
> lists would be an unnecessary source of complexity and overhead for
> these use cases. My understanding is that what is needed for these use
> cases is something that is close to the block layer request interface
> (REQ_OP_* + request flags + data buffer).
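(Purely to illustrate the kind of per-request descriptor described above:
the struct and field names in this sketch are made up, not an existing
kernel ABI; only the REQ_OP_* and REQ_* names are real block layer
constants.)

#include <linux/types.h>

/* hypothetical per-request descriptor handed to user space */
struct userblk_req {
        __u64   tag;            /* matches the completion below */
        __u32   op;             /* REQ_OP_READ, REQ_OP_WRITE, REQ_OP_FLUSH, ... */
        __u32   flags;          /* REQ_FUA, REQ_PREFLUSH, ... */
        __u64   sector;         /* start sector, 512-byte units */
        __u32   nr_sectors;     /* transfer length in sectors */
        __u32   pad;
        __u64   data_addr;      /* user address of a pre-mapped data buffer */
};

/* hypothetical completion written back by user space */
struct userblk_cmpl {
        __u64   tag;
        __s32   result;         /* 0 on success or negative errno */
        __u32   pad;
};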


Curiously, the former was exactly my idea. I was thinking about having a
simple nvmet userspace driver where all the transport 'magic' is handled
in the nvmet driver, and only the NVMe SQEs are passed on to the userland
driver. The userland driver would then send the CQEs back to the driver.

With that the kernel driver becomes extremely simple, and userspace would
be free to do all the magic it wants. More to the point, one could
implement all sorts of fancy features which are out of scope for the
current nvmet implementation. That is why I've been talking about an
'inverse' io_uring: the userland driver will have to wait for SQEs, and
write CQEs back to the driver.
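(To make that 'inverse io_uring' flow concrete, a minimal sketch of the
userland side: the /dev/nvmet-user0 node and its plain read/write protocol
are invented for this example; only the 64-byte SQE / 16-byte CQE sizes and
the command-identifier offsets come from the NVMe spec.)

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

struct nvme_sqe { uint8_t b[64]; };     /* submission queue entry */
struct nvme_cqe { uint8_t b[16]; };     /* completion queue entry */

int main(void)
{
        /* hypothetical char device exported by the nvmet userspace backend */
        int fd = open("/dev/nvmet-user0", O_RDWR);
        if (fd < 0)
                return 1;

        for (;;) {
                struct nvme_sqe sqe;
                struct nvme_cqe cqe = { 0 };

                /* 'inverse io_uring': block until the kernel hands us an SQE */
                if (read(fd, &sqe, sizeof(sqe)) != sizeof(sqe))
                        break;

                /*
                 * A real backend would decode the opcode and do the actual
                 * I/O here.  Copy the command identifier (SQE bytes 2-3)
                 * into CQE dword 3 (bytes 12-13) so the kernel can match
                 * the completion to the command; status, phase bit and SQ
                 * head are assumed to be owned by the kernel side here.
                 */
                memcpy(&cqe.b[12], &sqe.b[2], 2);

                /* post the completion back to the kernel driver */
                if (write(fd, &cqe, sizeof(cqe)) != sizeof(cqe))
                        break;
        }
        close(fd);
        return 0;
}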

Cheers,

Hannes
--
Dr. Hannes Reinecke                Kernel Storage Architect
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


