RE: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: David Howells <dhowells@xxxxxxxxxx>
> Sent: Thursday, 16 March 2023 16:26
> To: Matthew Wilcox <willy@xxxxxxxxxxxxx>; David S. Miller
> <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Jakub Kicinski
> <kuba@xxxxxxxxxx>; Paolo Abeni <pabeni@xxxxxxxxxx>
> Cc: David Howells <dhowells@xxxxxxxxxx>; Al Viro <viro@xxxxxxxxxxxxxxxxxx>;
> Christoph Hellwig <hch@xxxxxxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>; Jeff
> Layton <jlayton@xxxxxxxxxx>; Christian Brauner <brauner@xxxxxxxxxx>; Linus
> Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; linux-
> fsdevel@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> Bernard Metzler <BMT@xxxxxxxxxxxxxx>; Tom Talpey <tom@xxxxxxxxxx>; linux-
> rdma@xxxxxxxxxxxxxxx
> Subject: [EXTERNAL] [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
> 
> do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
> so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
> replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.
> 
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Bernard Metzler <bmt@xxxxxxxxxxxxxx>
> cc: Tom Talpey <tom@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: linux-rdma@xxxxxxxxxxxxxxx
> cc: netdev@xxxxxxxxxxxxxxx
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 05052b49107f..8fc179321e2b 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx,
> struct socket *s,
>  }
> 


Hi David,

many thanks for looking into that!

I reply to a limited audience, expecting limited interest,
basically to the rdma list plus Tom.


>  /*
> - * 0copy TCP transmit interface: Use do_tcp_sendpages.
> + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
>   *
>   * Using sendpage to push page by page appears to be less efficient
>   * than using sendmsg, even if data are copied.

That is an interesting observation. Is efficiency to be read as
CPU load, or throughput on the wire, or both?

Back in the days, I introduced that zcopy path for efficiency
reasons - getting both better throughput and less CPU load.
I looked at both WRITE and READ performance. Using
do_tcp_sendpages() is currently limited to processing work
which is not registered with local completion generation.
Replying to a remote READ request is a typical case. Did
you check with READ?

> @@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx,
> struct socket *s,
>  static int siw_tcp_sendpages(struct socket *s, struct page **page, int
> offset,
>  			     size_t size)
>  {
> +	struct bio_vec bvec;
> +	struct msghdr msg = {
> +		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
> +			      MSG_SENDPAGE_NOTLAST),
> +	};
>  	struct sock *sk = s->sk;
> -	int i = 0, rv = 0, sent = 0,
> -	    flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST;
> +	int i = 0, rv = 0, sent = 0;
> 
>  	while (size) {
>  		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
> 
>  		if (size + offset <= PAGE_SIZE)
> -			flags = MSG_MORE | MSG_DONTWAIT;
> +			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE |
> MSG_DONTWAIT;
> 
>  		tcp_rate_check_app_limited(sk);
> +		bvec_set_page(&bvec, page[i], bytes, offset);
> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +
>  try_page_again:
>  		lock_sock(sk);
> -		rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
> +		rv = tcp_sendmsg_locked(sk, &msg, size);

Would that tcp_sendmsg_locked() with a msg flagged
MSG_SPLICE_PAGES still have zero copy semantics?


>  		release_sock(sk);
> 
>  		if (rv > 0) {





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux