On Mon, Jun 10, 2024 at 02:38:18PM +0200, Christian König wrote:
> On 10.06.24 at 14:16, Jason Gunthorpe wrote:
> > On Mon, Jun 10, 2024 at 02:07:01AM +0100, Pavel Begunkov wrote:
> > > On 6/10/24 01:37, David Wei wrote:
> > > > On 2024-06-07 17:52, Jason Gunthorpe wrote:
> > > > > IMHO it seems to compose poorly if you can only use the io_uring
> > > > > lifecycle model with io_uring registered memory, and not with DMABUF
> > > > > memory registered through Mina's mechanism.
> > > > By this, do you mean io_uring must be exclusively used to use this
> > > > feature?
> > > >
> > > > And you'd rather see the two decoupled, so userspace can register w/ say
> > > > dmabuf then pass it to io_uring?
> > > Personally, I have no clue what Jason means. You can just as
> > > well say that it's poorly composable that write(2) to a disk
> > > cannot post a completion into a XDP ring, or a netlink socket,
> > > or io_uring's main completion queue, or name any other API.
> > There is no reason you shouldn't be able to use your fast io_uring
> > completion and lifecycle flow with DMABUF backed memory. Those are not
> > wildly different things and there is good reason they should work
> > together.
>
> Well there is the fundamental problem that you can't use io_uring to
> implement the semantics necessary for a dma_fence.
>
> That's why we had to reject the io_uring work on DMA-buf sharing from Google
> a few years ago.
>
> But this only affects the dma_fence synchronization part of DMA-buf, but
> *not* the general buffer sharing.

More precisely, it only impacts the userspace/data access implicit
synchronization part of dma-buf. For tracking buffer movements like on
invalidations/refault with a dynamic dma-buf importer/exporter I think the
dma-fence rules are acceptable. At least they've been for rdma drivers.

But the escape hatch is to (temporarily) pin the dma-buf, which is exactly
what direct I/O also does when accessing pages.
So aside from the still unsolved question on how we should account/track
pinned dma-buf, there shouldn't be an issue. Or at least I'm failing to see
one.

And for synchronization to data access the dma-fence stuff on dma-buf is
anyway rather deprecated on the gpu side too, exactly because of all these
limitations. On the gpu side we've been moving to free-standing
drm_syncobj instead, but those are fairly gpu specific and any other
subsystem should be able to just reuse what they have already to signal
transaction completions.

Cheers, Sima

>
> Regards,
> Christian.
>
> >
> > Pretending they are totally different just because two different
> > people wrote them is a very siloed view.
> >
> > > The devmem TCP callback can implement it in a way feasible to
> > > the project, but it cannot directly post events to an unrelated
> > > API like io_uring. And devmem attaches buffers to a socket,
> > > for which a ring for returning buffers might even be a nuisance.
> > If you can't compose your io_uring completion mechanism with a DMABUF
> > provided backing store then I think it needs more work.
> >
> > Jason
>
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch