> >> # discussion / questions
> >>
> >> I haven't got a grasp on many aspects of the net stack yet, so would
> >> appreciate feedback in general and there are a couple of questions
> >> and thoughts.
> >>
> >> 1) What are initialisation rules for adding a new field into
> >> struct msghdr? E.g. many users (mainly LLD) hand code initialisation not
> >> filling all the fields.
> >>
> >> 2) I don't like too much ubuf_info propagation from udp_sendmsg() into
> >> __ip_append_data() (see 3/12). Ideas how to do it better?
> >
> > Agreed that both of these are less than ideal.
> >
> > I can't comment too much on the io_uring aspect of the patch series.
> > But msg_zerocopy is probably used in a small fraction of traffic (even
> > if a high fraction for users who care about its benefits). We have to
> > try to minimize the cost incurred on the general hot path.
>
> One thing, I can hide the initial ubuf check in the beginning of
> __ip_append_data() under a common
>
> if (sock_flag(sk, SOCK_ZEROCOPY)) {}
>
> But as SOCK_ZEROCOPY is more of a design problem workaround,
> tbh not sure I like from the API perspective. Thoughts?

Agreed. io_uring does not have the legacy concerns that msg_zerocopy
had to resolve.

It is always possible to hide runtime overhead behind a static_branch,
if nothing else.

Or perhaps do pass the flag and use that:

-	if (flags & MSG_ZEROCOPY && length && sock_flag(sk, SOCK_ZEROCOPY)) {
+	if (flags & MSG_ZEROCOPY && length) {
+		if (uarg) {

etc.

> I hope
> I can also shuffle some of the stuff in 5/12 out of the
> hot path, need to dig a bit deeper.
>
> > I was going to suggest using the standard msg_zerocopy ubuf_info
> > alloc/free mechanism. But you explicitly mention seeing kmalloc/kfree
> > in the cycle profile.
> >
> > It might still be possible to somehow signal to msg_zerocopy_alloc
> > that this is being called from within an io_uring request, and
> > therefore should use a pre-existing uarg with different
> > uarg->callback. If nothing else, some info can be passed as a cmsg.
> > But perhaps there is a more direct pointer path to follow from struct
> > sk, say? Here my limited knowledge of io_uring forces me to hand wave.
>
> One thing I consider important though is to be able to specify a
> ubuf per request, but not somehow registering it in a socket. It's
> more flexible from the userspace API perspective. It would also need
> constant register/unregister, and there are concerns with
> referencing/cancellations, that's where it came from in the first
> place.

What if the ubuf pool can be found from the sk, and the index in that
pool is passed as a cmsg?
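
To make the cmsg idea a bit more concrete, here is a rough userspace
sketch of the encode/lookup round trip. It is only an illustration under
assumptions: SCM_ZC_UBUF_IDX_HYPOTHETICAL, struct fake_ubuf and
struct fake_ubuf_pool are made-up names that do not exist in the kernel,
and the lookup function merely stands in for whatever the send path would
do. Only struct msghdr and the CMSG_* macros are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* HYPOTHETICAL cmsg type: nothing like this exists upstream. */
#define SCM_ZC_UBUF_IDX_HYPOTHETICAL 0x7a01

#define UBUF_POOL_SIZE 4

/* Stand-in for struct ubuf_info (the real one carries a callback etc.). */
struct fake_ubuf {
	int id;
};

/* Stand-in for a pool of pre-allocated ubufs reachable from the sk. */
struct fake_ubuf_pool {
	struct fake_ubuf slots[UBUF_POOL_SIZE];
};

/* Submitter side: attach the pool index to msg as one SOL_SOCKET cmsg. */
void msg_set_ubuf_idx(struct msghdr *msg, void *cbuf, size_t cbuf_len,
		      uint32_t idx)
{
	struct cmsghdr *cm;

	assert(cbuf_len >= CMSG_SPACE(sizeof(idx)));
	memset(cbuf, 0, cbuf_len);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(idx));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_ZC_UBUF_IDX_HYPOTHETICAL;
	cm->cmsg_len = CMSG_LEN(sizeof(idx));
	memcpy(CMSG_DATA(cm), &idx, sizeof(idx));
}

/*
 * Send-path side: walk the cmsgs, pick out the index and resolve it in
 * the pool. Returns NULL when no index was passed, the cmsg is
 * malformed, or the index is out of range.
 */
struct fake_ubuf *msg_find_ubuf(struct msghdr *msg,
				struct fake_ubuf_pool *pool)
{
	struct cmsghdr *cm;
	uint32_t idx;

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level != SOL_SOCKET ||
		    cm->cmsg_type != SCM_ZC_UBUF_IDX_HYPOTHETICAL)
			continue;
		if (cm->cmsg_len != CMSG_LEN(sizeof(idx)))
			return NULL;
		memcpy(&idx, CMSG_DATA(cm), sizeof(idx));
		return idx < UBUF_POOL_SIZE ? &pool->slots[idx] : NULL;
	}
	return NULL;
}
```

The point of the sketch is only that the ubuf stays per request (the
index travels with each msghdr) while the pool itself is set up once, so
nothing has to be registered and unregistered on the socket per send.
Whether cmsg parsing is cheap enough for this path is a separate question.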