Actually, I'd rather have something like an 'inverse io_uring', where
an application creates a memory region separated into several 'rings'
for submission and completion.
Then the kernel could write/map the incoming data onto the rings, and
the application can read from there.
Maybe it'll be worthwhile to look at virtio here.
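
To make the ring idea concrete, here is a rough single-process sketch of
the layout: one ring the "kernel" side fills with request descriptors, and
one ring the application fills with completions. Everything in it is an
assumption for illustration only (the descriptor fields, the REQ/CPL
formats, the Ring class and the opcode values); it is not io_uring's or
virtio's actual layout, and a real shared-memory version would also need
memory barriers and a notification mechanism.

```python
import struct

# Hypothetical descriptor layouts, invented for this sketch.
REQ = struct.Struct("<BQIH")   # op, start sector, number of sectors, tag
CPL = struct.Struct("<Hi")     # tag, status
OP_READ, OP_WRITE = 0, 1       # hypothetical opcodes


class Ring:
    """Single-producer/single-consumer ring over one flat buffer."""

    def __init__(self, entries, layout):
        # A power-of-two size lets free-running indices wrap with a mask.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        self.layout = layout
        self.buf = bytearray(entries * layout.size)
        self.head = 0  # consumer index, only advanced by the consumer
        self.tail = 0  # producer index, only advanced by the producer

    def push(self, *fields):
        if self.tail - self.head > self.mask:
            return False  # ring is full, producer must back off
        offset = (self.tail & self.mask) * self.layout.size
        self.layout.pack_into(self.buf, offset, *fields)
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        offset = (self.head & self.mask) * self.layout.size
        fields = self.layout.unpack_from(self.buf, offset)
        self.head += 1
        return fields


def demo():
    requests = Ring(4, REQ)      # filled by the "kernel" side
    completions = Ring(4, CPL)   # filled by the application

    # "Kernel" side: queue two incoming block requests.
    requests.push(OP_READ, 2048, 8, 1)
    requests.push(OP_WRITE, 4096, 16, 2)

    # Application side: consume requests, post a completion for each.
    while (req := requests.pop()) is not None:
        _op, _sector, _nr_sectors, tag = req
        completions.push(tag, 0)  # status 0 meaning success (assumed)

    # "Kernel" side: reap the completions.
    done = []
    while (cpl := completions.pop()) is not None:
        done.append(cpl)
    return done
```

Calling demo() pushes two requests through and returns the (tag, status)
pairs in submission order. The direction is the reverse of io_uring: here
the kernel is the producer on the request ring and the application is the
producer on the completion ring.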
There is lio loopback backed by tcmu... I'm assuming that nvmet can
hook into the same/similar interface. nvmet is pretty lean, and we
can probably help tcmu/equivalent scale better if that is a concern...
Sagi,
I looked at tcmu prior to starting this work. Other than the tcmu
overhead, one concern was the complexity of a scsi device interface
versus sending block requests to userspace.
The complexity is understandable, though it can be viewed as a
capability as well. Note I do not have any desire to promote tcmu here,
just trying to understand if we need a brand new interface rather than
making the existing one better.
What would be the advantage of doing it as an nvme target over delivering
directly to userspace as a block driver?
Well, for starters you gain the features and tools that are extensively
used with nvme. Plus you get the ecosystem support (development,
features, capabilities and testing). There are clear advantages of
plugging into an established ecosystem.
Also, consider the case where userspace wants to just look at the IO
descriptor, without the data actually being sent to userspace. I'm not
sure that would be doable with tcmu?
Again, if tcmu is not a good starting point (never ran it myself) we can
think of starting with a clean slate.
Another attempt to do the same thing here, now with device-mapper:
https://patchwork.kernel.org/project/dm-devel/patch/20201203215859.2719888-4-palmer@xxxxxxxxxxx/
I largely agree with the feedback given on this attempt.