On Thu, Feb 13, 2025 at 5:17 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > > On 2/12/25 19:18, Mina Almasry wrote: > > On Wed, Feb 12, 2025 at 7:52 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > >> > >> On 2/10/25 21:09, Mina Almasry wrote: > >>> On Wed, Feb 5, 2025 at 4:20 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > >>>> > >>>> On 2/3/25 22:39, Mina Almasry wrote: > >>>> ... > >>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > >>>>> index bb2b751d274a..3ff8f568c382 100644 > >>>>> --- a/include/linux/skbuff.h > >>>>> +++ b/include/linux/skbuff.h > >>>>> @@ -1711,9 +1711,12 @@ struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size, > >>>> ... > >>>>> int zerocopy_fill_skb_from_iter(struct sk_buff *skb, > >>>>> struct iov_iter *from, size_t length); > >>>>> @@ -1721,12 +1724,14 @@ int zerocopy_fill_skb_from_iter(struct sk_buff *skb, > >>>>> static inline int skb_zerocopy_iter_dgram(struct sk_buff *skb, > >>>>> struct msghdr *msg, int len) > >>>>> { > >>>>> - return __zerocopy_sg_from_iter(msg, skb->sk, skb, &msg->msg_iter, len); > >>>>> + return __zerocopy_sg_from_iter(msg, skb->sk, skb, &msg->msg_iter, len, > >>>>> + NULL); > >>>> > >>>> Instead of propagating it all the way down and carving a new path, why > >>>> not reuse the existing infra? You already hook into where ubuf is > >>>> allocated, you can stash the binding in there. And > >>> > >>> It looks like it's not possible to increase the side of ubuf_info at > >>> all, otherwise the BUILD_BUG_ON in msg_zerocopy_alloc() fires. > >>> > >>> It's asserting that sizeof(ubuf_info_msgzc) <= sizeof(skb->cb), and > >>> I'm guessing increasing skb->cb size is not really the way to go. > >>> > >>> What I may be able to do here is stash the binding somewhere in > >>> ubuf_info_msgzc via union with fields we don't need for devmem, and/or > >> > >> It doesn't need to account the memory against the user, and you > >> actually don't want that because dmabuf should take care of that. > >> So, it should be fine to reuse ->mmp. > >> > >> It's also not a real sk_buff, so maybe maintainers wouldn't mind > >> reusing some more space out of it, if that would even be needed. > >> > > > > netmem skb are real sk_buff, with the modification that frags are not > > We were discussing ubuf_info allocation, take a look at > msg_zerocopy_alloc(), it has nothing to do with netmems and all that. > Yes. My response was regarding the suggestion that we can use space in devmem skbs however we want though. > > readable, only in the case that the netmem is unreadable. I would not > > approve of considering netmem/devmem skbs "not real skbs", and start > > messing with the semantics of skb fields for devmem skbs, and having > > to start adding skb_is_devmem() checks through all code in the skb > > handlers that touch the fields being overwritten in the devmem case. > > No, I don't think we can re-use random fields in the skb for devmem. > > > >>> stashing the binding in ubuf_info_ops (very hacky). Neither approach > >>> seems ideal, but the former may work and may be cleaner. > >>> > >>> I'll take a deeper look here. I had looked before and concluded that > >>> we're piggybacking devmem TX on MSG_ZEROCOPY path, because we need > >>> almost all of the functionality there (no copying, send complete > >>> notifications, etc), with one minor change in the skb filling. I had > >>> concluded that if MSG_ZEROCOPY was never updated to use the existing > >>> infra, then it's appropriate for devmem TX piggybacking on top of it > >> > >> MSG_ZEROCOPY does use the common infra, i.e. passing ubuf_info, > >> but doesn't need ->sg_from_iter as zerocopy_fill_skb_from_iter() > >> and it's what was there first. > >> > > > > But MSG_ZEROCOPY doesn't set msg->msg_ubuf. And not setting > > msg->msg_ubuf fails to trigger msg->sg_from_iter altogether. > > > > And also currently sg_from_iter isn't set up to take in a ubuf_info. > > We'd need that if we stash the binding in the ubuf_info. > > https://github.com/isilence/linux.git sg-iter-ops > > I have old patches for all of that, they even rebased cleanly. That > should do it for you, and I need to send then regardless of devmem. > > These patches help a bit, but do not make any meaningful dent in addressing the concern I have in the earlier emails. The concern is that we're piggybacking devmem TX on MSG_ZEROCOPY, and currently the MSG_ZEROCOPY code carefully avoids any code paths setting msg->[sg_from_iter|msg_ubuf]. If we want devmem to reuse both the MSG_ZEROCOPY mechanisms and the msg->[sg_from_iter|ubuf_info] mechanism, I have to dissect the MSG_ZEROCOPY code carefully so that it works with and without setting msg->[ubuf_info|msg->sg_from_iter]. Having gone through this rabbit hole so far I see that it complicates the implementation and adds more checks to the fast MSG_ZEROCOPY paths. The complication could be worth it if there was some upside, but I don't see one tbh. Passing the binding down to zerocopy_fill_skb_from_devmem seems like a better approach to my eye so far I'm afraid I'm going to table this for now. If there is overwhelming consensus that msg->sg_from_iter is the right approach here I will revisit, but it seems to me to complicate code without a significant upside. -- Thanks, Mina