Subsystems providing external ubufs to the net layer, i.e. ->msg_ubuf,
might have a better way to refcount it. For instance, io_uring can
amortise ref allocation.

Add a way to pass one extra ref to ->msg_ubuf into the network stack by
setting the struct msghdr::msg_ubuf_ref bit. Whoever consumes the ref
should clear the flag. If not consumed, it's the responsibility of the
caller to put it. Make __ip{,6}_append_data() use it.

Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
 include/linux/socket.h | 1 +
 net/ipv4/ip_output.c   | 3 +++
 net/ipv6/ip6_output.c  | 3 +++
 3 files changed, 7 insertions(+)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index ba84ee614d5a..ae869dee82de 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -72,6 +72,7 @@ struct msghdr {
 	 * to be non-NULL.
 	 */
 	bool		msg_managed_data : 1;
+	bool		msg_ubuf_ref : 1;
 	unsigned int	msg_flags;	/* flags on received message */
 	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	struct kiocb	*msg_iocb;	/* ptr to iocb for async requests */
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 3fd1bf675598..d73ec0a73bd2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1032,6 +1032,9 @@ static int __ip_append_data(struct sock *sk,
 				paged = true;
 				zc = true;
 				uarg = msg->msg_ubuf;
+				/* we might've been given a free ref */
+				extra_uref = msg->msg_ubuf_ref;
+				msg->msg_ubuf_ref = false;
 			}
 		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
 			uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f4138ce6eda3..90bbaab21dbc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1557,6 +1557,9 @@ static int __ip6_append_data(struct sock *sk,
 				paged = true;
 				zc = true;
 				uarg = msg->msg_ubuf;
+				/* we might've been given a free ref */
+				extra_uref = msg->msg_ubuf_ref;
+				msg->msg_ubuf_ref = false;
 			}
 		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
 			uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
-- 
2.36.1
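
For illustration only, the ownership rule from the commit message (the
caller offers one extra ref via msg_ubuf_ref, the consumer clears the
flag when it takes the ref, and the caller puts the ref if the flag is
still set afterwards) can be modelled in plain userspace C. This is a
sketch, not kernel code: struct mock_ubuf, struct mock_msghdr,
mock_ubuf_put(), mock_consume() and mock_send() are hypothetical names
that do not exist in the tree, and the way the stack eventually drops a
consumed ref is reduced to a single put.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ubuf_info: just a plain refcount. */
struct mock_ubuf {
	int refs;
};

/* Hypothetical stand-in for the two struct msghdr fields involved. */
struct mock_msghdr {
	struct mock_ubuf *msg_ubuf;
	bool msg_ubuf_ref;	/* set: one extra ref is offered to the callee */
};

static void mock_ubuf_put(struct mock_ubuf *u)
{
	assert(u->refs > 0);
	u->refs--;
}

/*
 * Consumer side, mirroring the lines the patch adds to
 * __ip{,6}_append_data(): take over the offered ref, if any, and clear
 * the flag so the caller knows it no longer owns that ref.
 */
static bool mock_consume(struct mock_msghdr *msg)
{
	bool extra_uref = msg->msg_ubuf_ref;

	msg->msg_ubuf_ref = false;
	return extra_uref;
}

/*
 * Caller side (assumption: roughly what a subsystem such as io_uring
 * would do). Returns the refcount left once the send has finished.
 */
static int mock_send(struct mock_ubuf *u, bool reaches_consumer)
{
	struct mock_msghdr msg = { .msg_ubuf = u, .msg_ubuf_ref = true };
	bool stack_owns_ref = false;

	u->refs++;		/* the extra ref being offered */

	if (reaches_consumer)
		stack_owns_ref = mock_consume(&msg);

	/* Flag still set: nobody consumed the ref, so the caller puts it. */
	if (msg.msg_ubuf_ref)
		mock_ubuf_put(u);

	/* Simplification: the stack drops its ref once it is done with it. */
	if (stack_owns_ref)
		mock_ubuf_put(u);

	return u->refs;
}
```

In this model the refcount is balanced on both paths: whether or not
the consumer is reached, exactly one party drops the extra ref, so
mock_send() returns the count it started with.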