On 10/24/24 10:23, Christoph Hellwig wrote:
> On Wed, Oct 23, 2024 at 03:34:53PM +0100, Pavel Begunkov wrote:
>> It doesn't care much what kind of memory it is, nor is it important
>> for internals how it's imported, it's user addresses -> pages for
>> user convenience sake. All the net_iov setup code is in the page pool
>> core code. What it does, however, is implementing the user API, so
>
> That's not what this series does. It adds the new memory_provider_ops
> set of hooks, with one implementation for dmabufs, and one for
> io_uring zero copy.

First, it's not a _new_ abstraction over a buffer as you called it
before, the abstraction (net_iov) is already merged.
Second, you mention devmem TCP, and it's not just a page pool with
"dmabufs", it's a user API to use it and other memory-agnostic
allocation logic. And yes, dmabufs there are the least technically
important part. Just having a dmabuf handle solves absolutely nothing.

> So you are precluding zero copy RX into anything but your magic
> io_uring buffers, and using an odd abstraction for that.

Right, io_uring zero copy RX API expects transfer to happen into io_uring
controlled buffers, and that's the entire idea. Buffers that are based
on an existing network specific abstraction, which are not restricted to
pages or anything specific in the long run, but the flow of which from
net stack to user and back is controlled by io_uring. If you worry about
abuse, io_uring can't even sanely initialise those buffers itself and
is therefore asking the page pool code to do that.

> The right way would be to support zero copy RX into every
> designated dmabuf, and make io_uring work with udmabuf or if

I have no idea what you mean, but shoving dmabufs into every single
place regardless of whether it makes sense or not is hardly a good
way forward.

> absolutely needed its own kind of dmabuf. Instead we create

I'm even more confused how that would help. The user API has to
be implemented and adding a new dmabuf gives nothing, not even
mentioning it's not clear what the semantics of that beast are
supposed to be.

> a maze of incompatible abstractions here. The use case of e.g.
> doing zero copy receive into a NVMe CMB using PCIe P2P transactions
> is anything but made up, so this does create a problem.

That's some kind of confusion again, there is no reason why
it can't be supported, transparently to the non-setup code at
that. That's left out, like other bits, for further iterations to
keep this set simpler.
--
Pavel Begunkov