On Thu, Dec 26, 2024 at 11:07 AM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
>
> On 12/21, Mina Almasry wrote:
> > Currently net_iovs support only pp ref counts, and do not support a
> > page ref equivalent.
> >
> > This is fine for the RX path as net_iovs are used exclusively with the
> > pp and only pp refcounting is needed there. The TX path however does not
> > use pp ref counts, thus, support for get_page/put_page equivalent is
> > needed for netmem.
> >
> > Support get_netmem/put_netmem. Check the type of the netmem before
> > passing it to page or net_iov specific code to obtain a page ref
> > equivalent.
> >
> > For dmabuf net_iovs, we obtain a ref on the underlying binding. This
> > ensures the entire binding doesn't disappear until all the net_iovs have
> > been put_netmem'ed. We do not need to track the refcount of individual
> > dmabuf net_iovs as we don't allocate/free them from a pool similar to
> > what the buddy allocator does for pages.
> >
> > This code is written to be extensible by other net_iov implementers.
> > get_netmem/put_netmem will check the type of the netmem and route it to
> > the correct helper:
> >
> > pages -> [get|put]_page()
> > dmabuf net_iovs -> net_devmem_[get|put]_net_iov()
> > new net_iovs -> new helpers
> >
> > Signed-off-by: Mina Almasry <almasrymina@xxxxxxxxxx>
> >
> > ---
> >  include/linux/skbuff_ref.h |  4 ++--
> >  include/net/netmem.h       |  3 +++
> >  net/core/devmem.c          | 10 ++++++++++
> >  net/core/devmem.h          | 11 +++++++++++
> >  net/core/skbuff.c          | 30 ++++++++++++++++++++++++++++++
> >  5 files changed, 56 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/skbuff_ref.h b/include/linux/skbuff_ref.h
> > index 0f3c58007488..9e49372ef1a0 100644
> > --- a/include/linux/skbuff_ref.h
> > +++ b/include/linux/skbuff_ref.h
> > @@ -17,7 +17,7 @@
> >   */
> >  static inline void __skb_frag_ref(skb_frag_t *frag)
> >  {
> > -	get_page(skb_frag_page(frag));
> > +	get_netmem(skb_frag_netmem(frag));
> >  }
> >
> >  /**
> > @@ -40,7 +40,7 @@ static inline void skb_page_unref(netmem_ref netmem, bool recycle)
> >  	if (recycle && napi_pp_put_page(netmem))
> >  		return;
> >  #endif
>
> [..]
>
> > -	put_page(netmem_to_page(netmem));
> > +	put_netmem(netmem);
>
> I moved the release operation onto a workqueue in my series [1] to avoid
> calling dmabuf detach (which can sleep) from the socket close path
> (which is called with bh disabled). You should probably do something similar,
> see the trace attached below.
>
> 1: https://github.com/fomichev/linux/commit/3b3ad4f36771a376c204727e5a167c4993d4c65a#diff-3c58b866674b2f9beb5ac7349f81566e4df595c25c647710203549589d450f2dR436
>
> (the condition to trigger that is to have an skb in the write queue
> and call close from the userspace)
>

Thanks for catching this, indeed. I've also changed the unbinding to use
schedule_work(), although I arrived at a slightly different implementation
that is simpler to my eye. I'll upload what I have for review shortly as
an RFC.

--
Thanks,
Mina
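
The net/core/skbuff.c hunk that adds get_netmem()/put_netmem() is not quoted
above. A minimal sketch of the type-based routing the commit message
describes, reconstructed here for illustration rather than copied from the
patch, and assuming the existing netmem_is_net_iov()/netmem_to_net_iov()
helpers from include/net/netmem.h plus the net_devmem_[get|put]_net_iov()
helpers named in the commit message, would look roughly like:

/* Sketch only: route a page-ref-equivalent get/put by netmem type. */
void get_netmem(netmem_ref netmem)
{
	if (netmem_is_net_iov(netmem)) {
		net_devmem_get_net_iov(netmem_to_net_iov(netmem));
		return;
	}
	get_page(netmem_to_page(netmem));
}

void put_netmem(netmem_ref netmem)
{
	if (netmem_is_net_iov(netmem)) {
		net_devmem_put_net_iov(netmem_to_net_iov(netmem));
		return;
	}
	put_page(netmem_to_page(netmem));
}

Per the commit message, net_devmem_get_net_iov()/net_devmem_put_net_iov()
take and drop a reference on the net_iov's underlying dmabuf binding, so the
binding cannot disappear until the last net_iov has been put_netmem'ed.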
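
For the workqueue point, a minimal sketch of deferring the unbind, assuming a
work_struct is embedded in struct net_devmem_dmabuf_binding (the unbind_w
field and the __net_devmem_unbind_dmabuf() helper below are placeholder names
for illustration, not taken from either series):

/* Sketch only: defer the dma-buf teardown (dma_buf_unmap_attachment(),
 * dma_buf_detach(), both of which can sleep) to process context, because
 * the final put can happen from the socket close path with BHs disabled.
 */
static void net_devmem_unbind_dmabuf_work(struct work_struct *work)
{
	struct net_devmem_dmabuf_binding *binding =
		container_of(work, struct net_devmem_dmabuf_binding, unbind_w);

	__net_devmem_unbind_dmabuf(binding);	/* may sleep */
}

void net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding)
{
	if (!refcount_dec_and_test(&binding->ref))
		return;

	INIT_WORK(&binding->unbind_w, net_devmem_unbind_dmabuf_work);
	schedule_work(&binding->unbind_w);
}

Whether the work is initialized at bind time or at the final put is a detail;
the point is that the sleeping detach no longer runs with BHs disabled.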