On 7/21/22 10:38, Paul Cercueil wrote:
Hi Pavel,

Good job on the io_uring zerocopy stuff, that looks really interesting!

I'm working on adding a new userspace/kernelspace buffer interface for the IIO subsystem. My first idea (a few years ago already) was to add support for splice(), so that the data could be sent from IIO hardware directly to a file or to the network. It turned out not to work very well because of how splice() works. The kernel would erase pages to be exchanged with the pipe data pages, so the speed gains obtained by not copying data pages were underwhelming and the CPU usage was almost as high (CPU usage being our limiting factor here).

We then settled for a dmabuf-based interface [1] which works great as a userspace/kernelspace interface, but doesn't allow zero-copy to disk or network (until someone adds support for it, I guess). The patchset got refused on the basis that (against all documentation) dmabuf really is a gpu/drm thing and shouldn't be used elsewhere.
The idea I've got is that passing buffers as dmabufs is the only viable approach, especially since GPU <-> NIC transfers are of much interest and there were attempts at exposing NVMe's CMB as dma-bufs (not sure where that ended up).
My question for you is: would your new io_uring zerocopy work allow, for instance, transferring data from storage to the network without triggering this "page clearing" mechanism that splice() has?
That's the plan. We prototyped it before, but it needs some more work.
[1] https://lore.kernel.org/linux-doc/20220207125933.81634-7-paul@xxxxxxxxxxxxxxx/T/
-- Pavel Begunkov