RE: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES

David Howells wrote:
> Make TCP's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
> spliced from the source iterator if possible (the iterator must be
> ITER_BVEC and the pages must be spliceable).
> 
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
> 
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: netdev@xxxxxxxxxxxxxxx
> ---
>  net/ipv4/tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 53 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 288693981b00..77c0c69208a5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>  	int flags, err, copied = 0;
>  	int mss_now = 0, size_goal, copied_syn = 0;
>  	int process_backlog = 0;
> -	bool zc = false;
> +	int zc = 0;
>  	long timeo;
>  
>  	flags = msg->msg_flags;
> @@ -1231,17 +1231,24 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>  		if (msg->msg_ubuf) {
>  			uarg = msg->msg_ubuf;
>  			net_zcopy_get(uarg);
> -			zc = sk->sk_route_caps & NETIF_F_SG;
> +			if (sk->sk_route_caps & NETIF_F_SG)
> +				zc = 1;
>  		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
>  			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
>  			if (!uarg) {
>  				err = -ENOBUFS;
>  				goto out_err;
>  			}
> -			zc = sk->sk_route_caps & NETIF_F_SG;
> -			if (!zc)
> +			if (sk->sk_route_caps & NETIF_F_SG)
> +				zc = 1;
> +			else
>  				uarg_to_msgzc(uarg)->zerocopy = 0;
>  		}
> +	} else if (unlikely(flags & MSG_SPLICE_PAGES) && size) {
> +		if (!iov_iter_is_bvec(&msg->msg_iter))
> +			return -EINVAL;
> +		if (sk->sk_route_caps & NETIF_F_SG)
> +			zc = 2;
>  	}

The commit message mentions MSG_SPLICE_PAGES as an internal flag.

However, it can also be passed from userspace. The code anticipates
that and checks preconditions.

A side effect is that legacy applications that may already be setting
this bit in the flags will now start failing with -EINVAL, because an
iterator built from a userspace buffer is not ITER_BVEC. Most socket
types are historically permissive and simply ignore undefined flags.

With MSG_ZEROCOPY we chose to be extra cautious and added
SOCK_ZEROCOPY (set through the SO_ZEROCOPY socket option), only testing
the MSG_ZEROCOPY bit if this option is explicitly enabled. That was
perhaps more cautious than necessary, but FYI.
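
To make the contrast concrete, here is a rough userspace model of the
two behaviours. It is only a sketch: DEMO_MSG_SPLICE, struct demo_sock,
splice_opt_in and both functions are hypothetical names, the flag value
is arbitrary, and the NETIF_F_SG check from the hunk above is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit, for illustration only (not the kernel's value). */
#define DEMO_MSG_SPLICE (1u << 27)

/* Hypothetical socket state; splice_opt_in plays the role that
 * SOCK_ZEROCOPY plays for MSG_ZEROCOPY. */
struct demo_sock {
	bool splice_opt_in;
};

/* Models the check in the quoted hunk: if the bit is set and there is
 * data, a non-bvec iterator is rejected.
 * Returns 1 for "would splice", 0 for "would copy", or -EINVAL. */
static int demo_send_strict(unsigned int flags, bool iter_is_bvec, size_t size)
{
	if ((flags & DEMO_MSG_SPLICE) && size) {
		if (!iter_is_bvec)
			return -EINVAL;
		return 1;
	}
	return 0;
}

/* Models the opt-in alternative: without the opt-in the bit is masked
 * off and silently ignored, so a legacy caller falls back to copying. */
static int demo_send_gated(const struct demo_sock *sk, unsigned int flags,
			   bool iter_is_bvec, size_t size)
{
	if (!sk->splice_opt_in)
		flags &= ~DEMO_MSG_SPLICE;
	return demo_send_strict(flags, iter_is_bvec, size);
}
```

In this model a legacy caller that sets the bit on a userspace buffer
gets -EINVAL from demo_send_strict(), while demo_send_gated() on a
socket that has not opted in ignores the bit and takes the copy path.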


