Re: [PATCH net-next v8 11/17] io_uring/zcrx: implement zerocopy receive pp memory provider

Pavel Begunkov <asml.silence@xxxxxxxxx> · Tue, 10 Dec 2024 04:50:40 +0000

On 12/10/24 04:45, Pavel Begunkov wrote:
On 12/10/24 04:01, Jakub Kicinski wrote:
On Wed,  4 Dec 2024 09:21:50 -0800 David Wei wrote:
Then, either the buffer is dropped and returns back to the page pool
into the ->freelist via io_pp_zc_release_netmem, in which case the page
pool will match hold_cnt for us with ->pages_state_release_cnt. Or more
likely the buffer will go through the network/protocol stacks and end up
in the corresponding socket's receive queue. From there the user can get
it via a new io_uring request implemented in the following patches. As
mentioned above, before giving a buffer to the user we bump the refcount
by IO_ZC_RX_UREF.

Once the user is done with the buffer processing, it must return it back
via the refill queue, from where our ->alloc_netmems implementation can
grab it, check references, put IO_ZC_RX_UREF, and recycle the buffer if
there are no more users left. As we place such buffers right back into
the page pool's fast cache and they didn't go through the normal pp
release path, they are still considered "allocated" and no pp hold_cnt
is required. For the same reason we dma sync buffers for the device
in io_zc_add_pp_cache().

Can you say more about the IO_ZC_RX_UREF bias? net_iov is not the page
struct, we can add more fields. In fact we have 8B of padding in it
that can be allocated without growing the struct. So why play with

I guess we can, though that grows it for everyone, not just
io_uring, considering how indexing works, i.e. no embedding into
a larger struct.

biases? You can add a 32b atomic counter for how many refs have been
handed out to the user.

This set does it in a stupid way, but the bias allows coalescing
operations on it into a single atomic. Regardless, it can be
placed separately, though we still need a good way to optimise
counting. Take a look at my reply with questions in the v7 thread,
I outlined what can work quite well in terms of performance but
needs a clear API for that from net/.
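
To illustrate the "single atomic" point, here is a minimal userspace
sketch using C11 atomics. It is not the code from the series: the
struct, the helper names and the UREF_BIAS value are hypothetical
stand-ins for the net_iov refcount and IO_ZC_RX_UREF. The idea is only
that kernel refs and user refs share one counter, so returning a buffer
drops the user ref and tests for "no users left" in one operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bias; stands in for IO_ZC_RX_UREF in this sketch. */
#define UREF_BIAS 0x10000u

/* Hypothetical stand-in for the per-buffer refcount. */
struct buf_sketch {
	/* values below UREF_BIAS: kernel refs; multiples of it: user refs */
	atomic_uint refs;
};

/* Hand the buffer to the user: one atomic add records the user ref. */
static void buf_give_to_user(struct buf_sketch *b)
{
	atomic_fetch_add(&b->refs, UREF_BIAS);
}

/*
 * The user returned the buffer via the refill queue. A single atomic
 * drops the user ref and tells us whether any other ref remains.
 * Returns true when the buffer has no users left and can be recycled.
 */
static bool buf_put_user_ref(struct buf_sketch *b)
{
	unsigned int old = atomic_fetch_sub(&b->refs, UREF_BIAS);

	assert(old >= UREF_BIAS);	/* a user ref must have been held */
	return old == UREF_BIAS;
}
```

With one kernel ref still outstanding, buf_put_user_ref() returns false
and leaves that ref in place; with none, it returns true.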

FWIW, I tried it and placed user refs into a separate array.
Without optimisations it'll be additional atomics + cache
bouncing, which is not great, but if we can somehow reuse the
frag ref as in replies to v7, that might work even better than
with the bias. Devmem might reuse that as well.
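
For comparison, a rough userspace sketch of the unoptimised
"separate array" variant, again with C11 atomics. Everything here is
an assumption made for illustration (array size, struct layout, helper
names, and the rule that each user ref pins one buffer ref); it is not
the code I tested. It mainly shows where the extra atomic per
operation comes from.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BUFS 4	/* hypothetical number of buffers in an area */

/* Hypothetical layout: user refs live apart from the buffer refs. */
struct area_sketch {
	atomic_uint buf_refs[NR_BUFS];	/* refs on the buffer itself */
	atomic_uint user_refs[NR_BUFS];	/* refs handed out to the user */
};

/* Give buffer idx to the user: two atomics instead of one. */
static void area_give_to_user(struct area_sketch *a, size_t idx)
{
	assert(idx < NR_BUFS);
	atomic_fetch_add(&a->buf_refs[idx], 1);
	atomic_fetch_add(&a->user_refs[idx], 1);
}

/*
 * The user returned buffer idx. Drop the user ref first, refusing a
 * return for which no user ref exists, then drop the buffer ref it
 * pinned. Returns true when the buffer can be recycled.
 */
static bool area_put_user_ref(struct area_sketch *a, size_t idx)
{
	unsigned int old;

	assert(idx < NR_BUFS);
	old = atomic_load(&a->user_refs[idx]);
	do {
		if (old == 0)
			return false;	/* bogus return from userspace */
	} while (!atomic_compare_exchange_weak(&a->user_refs[idx],
					       &old, old - 1));

	/* second atomic, on a different cache line than the first */
	return atomic_fetch_sub(&a->buf_refs[idx], 1) == 1;
}
```

Both counters are touched on every hand-out and return, which is the
additional atomics and cache bouncing mentioned above.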

--
Pavel Begunkov