On Wed, May 08, 2024 at 12:35:52PM +0100, Pavel Begunkov wrote:
> On 5/8/24 08:16, Daniel Vetter wrote:
> > On Tue, May 07, 2024 at 08:32:47PM -0300, Jason Gunthorpe wrote:
> > > On Tue, May 07, 2024 at 08:35:37PM +0100, Pavel Begunkov wrote:
> > > > On 5/7/24 18:56, Jason Gunthorpe wrote:
> > > > > On Tue, May 07, 2024 at 06:25:52PM +0100, Pavel Begunkov wrote:
> > > > > > On 5/7/24 17:48, Jason Gunthorpe wrote:
> > > > > > > On Tue, May 07, 2024 at 09:42:05AM -0700, Mina Almasry wrote:
> > > > > > >
> > > > > > > > 1. Align with devmem TCP to use udmabuf for your io_uring memory. I
> > > > > > > > think in the past you said it's a uapi you don't like but in the face
> > > > > > > > of this pushback you may want to reconsider.
> > > > > > >
> > > > > > > dmabuf does not force a uapi, you can acquire your pages however you
> > > > > > > want and wrap them up in a dmabuf. No uapi at all.
> > > > > > >
> > > > > > > The point is that dmabuf already provides ops that do basically what
> > > > > > > is needed here. We don't need ops calling ops just because dmabuf's
> > > > > > > ops are not understood or not perfect. Fixup dmabuf.
> > > > > >
> > > > > > Those ops, for example, are used to efficiently return used buffers
> > > > > > back to the kernel, which is uapi, I don't see how dmabuf can be
> > > > > > fixed up to cover it.
> > > > >
> > > > > Sure, but that doesn't mean you can't use dma buf for the other parts
> > > > > of the flow. The per-page lifetime is a different topic than the
> > > > > refcounting and access of the entire bulk of memory.
> > > >
> > > > Ok, so if we're leaving uapi (and ops) and keep per page/sub-buffer as
> > > > is, the rest is resolving uptr -> pages, and passing it to page pool in
> > > > a convenient to page pool format (net_iov).
> > >
> > > I'm not going to pretend to know about page pool details, but dmabuf
> > > is the way to get the bulk of pages into a pool within the net stack's
> > > allocator and keep that bulk properly refcounted while.
> > >
> > > An object like dmabuf is needed for the general case because there are
> > > not going to be per-page references or otherwise available.
> > >
> > > What you seem to want is to alter how the actual allocation flow works
> > > from that bulk of memory and delay the free. It seems like a different
> > > topic to me, and honestly hacking into the allocator free function
> > > seems a bit weird..
> >
> > Also I don't see how it's an argument against dma-buf as the interface for
>
> It's not, neither I said it is, but it is an argument against removing
> the network's page pool ops.
>
> > all these, because e.g. ttm internally does have a page pool because
> > depending upon allocator, that's indeed beneficial. Other drm drivers have
> > more buffer-based concepts for opportunistically memory around, usually
> > by marking buffers that are just kept as cache as purgeable (which is a
> > concept that goes all the way to opengl/vulkan).
>
> Because in this case it solves nothing and helps with nothing, quite
> the opposite. Just as well we can ask why NVMe doesn't wrap user pages
> into a dmabuf while doing IO.

Because the rules around memory reclaim, gfp nesting and guaranteed
forward progress don't match up for block i/o. I looked quite a bit into
gluing direct i/o into dma-buf because there's vulkan extensions for that,
and it's an absolute mess.
-Sima

>
> > But these are all internals of the dma-buf exporter, the dma-buf api users
> > don't ever need to care.
> > -Sima
>
> --
> Pavel Begunkov

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch