On Tue, 18 Apr 2023 22:16:53 -0700 Christoph Hellwig wrote:
> On Mon, Apr 17, 2023 at 11:19:47PM -0700, Jakub Kicinski wrote:
> > Damn, that's unfortunate. Thinking aloud -- that means that if we want
> > to continue to pull memory management out of networking drivers to
> > improve it for all, cross-optimize with the rest of the stack and
> > allow various upcoming forms of zero copy -- then we need to add an
> > equivalent of dma_ops and DMA API locally in networking?
>
> Can you explain what the actual use case is?
>
> From the original patchset I suspect it is dma mapping something very
> long term and then maybe doing syncs on it as needed?

In this case, yes: pinned user memory. It gets sliced up into MTU-sized
chunks, fed into an Rx queue of a device, and the user can see packets
without any copies.

A quite similar use case #2 is the upcoming io_uring / "direct placement"
patches (the former from Meta, the latter from Google), which will try
to receive just the TCP data into pinned user memory.

And, as I think Olek mentioned, #3 is page_pool - which allocates 4k
pages, manages the DMA mappings, gives them to the device and tries to
recycle them back to the device once TCP is done with them (avoiding the
unmapping and even atomic ops on the refcount, as in the good case the
page refcount is always 1). See page_pool_return_skb_page() for the
recycling flow.

In all those cases it's more flexible (and faster) to hide the DMA
mapping from the driver. All the cases are also opt-in, so we don't need
to worry about complete oddball devices. And to answer your question, in
all cases we hope mapping/unmapping will be relatively rare while
syncing will be frequent.

AFAIU the patch we're discussing implements custom dma_ops for case #1,
but the same thing will be needed for #2 and #3. The question to me is
whether we need netdev-wide net_dma_ops or whether the device model can
provide us with a DMA API that'd work for SoC/PCIe/virt devices.
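
To make the page_pool case (#3) a bit more concrete, here is a rough
sketch of the driver side when the pool owns the DMA mapping. The
my_rx_ring / my_rx_desc structures and the sizes are made up for
illustration; only the page_pool calls (page_pool_create(),
page_pool_dev_alloc_pages(), page_pool_get_dma_addr()) are the real API
as it looks around this kernel version:

#include <net/page_pool.h>

/* made-up driver structures, just enough to show the calls */
struct my_rx_desc { __le64 addr; };
struct my_rx_ring { struct page_pool *pool; };

static int my_rx_ring_init(struct my_rx_ring *ring, struct device *dev)
{
	struct page_pool_params pp = {
		/* the pool maps the pages and does the device-side syncs,
		 * the driver never touches the DMA API directly
		 */
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.order		= 0,		/* 4k pages */
		.pool_size	= 1024,
		.nid		= NUMA_NO_NODE,
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.max_len	= PAGE_SIZE,
		.offset		= 0,
	};

	ring->pool = page_pool_create(&pp);
	return PTR_ERR_OR_ZERO(ring->pool);
}

static int my_rx_refill_one(struct my_rx_ring *ring, struct my_rx_desc *desc)
{
	struct page *page = page_pool_dev_alloc_pages(ring->pool);

	if (!page)
		return -ENOMEM;

	/* the DMA address was set up by the pool when the page was mapped */
	desc->addr = cpu_to_le64(page_pool_get_dma_addr(page));
	return 0;
}

On completion the driver attaches the page to an skb and marks it with
skb_mark_for_recycle(); in the good case the page then comes straight
back to the pool via page_pool_return_skb_page() with no unmap and no
refcount atomics.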
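
And purely to illustrate the shape of the question at the end - nothing
like this exists in the tree, the names and signatures below are
invented - a netdev-wide net_dma_ops would be a set of ops mirroring
dma_map_ops but scoped to networking memory providers, with map/unmap
expected to be rare and the syncs on the fast path:

/* entirely hypothetical, for illustration only */
struct net_dma_ops {
	/* long-lived mapping of provider memory (pinned user pages,
	 * page_pool pages, ...); expected to be called rarely
	 */
	dma_addr_t	(*map_page)(struct net_device *dev, struct page *page,
				    unsigned long offset, size_t size,
				    enum dma_data_direction dir);
	void		(*unmap_page)(struct net_device *dev, dma_addr_t addr,
				      size_t size, enum dma_data_direction dir);

	/* per-packet ownership transfers; this is the hot path */
	void		(*sync_for_cpu)(struct net_device *dev, dma_addr_t addr,
					size_t size, enum dma_data_direction dir);
	void		(*sync_for_device)(struct net_device *dev, dma_addr_t addr,
					   size_t size, enum dma_data_direction dir);
};

The alternative being that nothing netdev-specific is added at all and
the device model / generic DMA API covers SoC, PCIe and virt devices for
us.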