Re: [PATCH RFC] nfsd: Fix NFSv4 READ on RDMA when using readv

> On Jan 15, 2020, at 3:37 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> 
> svcrdma expects that the READ payload falls precisely into the
> xdr_buf's page vector. Adding "xdr->iov = NULL" forces
> xdr_reserve_space() to always use pages from xdr->buf->pages when
> calling nfsd_readv.
> 
> Also, the XDR padding is problematic. For NFS/RDMA Write chunks,
> the padding needs to be in xdr->buf->tail so that the transport can
> skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks,
> the padding has to be retained. Not yet sure how to add this.
> 
> Fixes: b04209806384 ("nfsd4: allow exotic read compounds")
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> ---
> Howdy Bruce-
> 
> I'm struggling with nfsd4_encode_readv().
> 
> - for NFS/RDMA Write chunks, the READ payload has to be in
>  buf->pages. I've fixed that.
> 
> - xdr_reserve_space() calls don't need to explicitly align the
>  @nbytes argument: xdr_reserve_space() already does this?
> 
> - the while loop probably won't work if a later READ in the COMPOUND
>  doesn't start on a page boundary. This isn't a problem until we
>  run into a Solaris client in forcedirectio mode.
> 
> - the XDR padding doesn't work for NFS/RDMA Write chunks, which are
>  supposed to skip padding altogether.
> 
> Do you have suggestions? Thanks in advance.
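
To make the page-boundary concern in the third bullet concrete, here is a minimal userspace sketch, not the nfsd code. It splits a READ payload into per-page chunk lengths when the payload starts at an arbitrary offset within a page. The names example_split_read() and EXAMPLE_PAGE_SIZE are hypothetical, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/*
 * Hypothetical helper: split a payload of @len bytes that begins
 * @page_off bytes into a page into chunks that never cross a page
 * boundary. Chunk lengths are stored in @lens (capacity @max).
 * Returns the number of chunks, or 0 if @len is 0 or @max is too small.
 */
static size_t example_split_read(size_t page_off, size_t len,
				 size_t *lens, size_t max)
{
	size_t n = 0;

	assert(lens != NULL);
	page_off %= EXAMPLE_PAGE_SIZE;
	while (len > 0) {
		/* Only the first chunk can start mid-page */
		size_t room = EXAMPLE_PAGE_SIZE - page_off;
		size_t thislen = len < room ? len : room;

		if (n == max)
			return 0;
		lens[n++] = thislen;
		len -= thislen;
		page_off = 0;
	}
	return n;
}
```

With a zero offset an 8192-byte payload yields two full-page chunks; with an offset of 100 the same payload yields three chunks, the first one short. A loop that always takes min(len, PAGE_SIZE) does not account for that case.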

I'm experimenting with an idea that I think has been mentioned on
the list a few times:

Have the RPC layer and the transports deal with the padding of the
xdr_buf->pages vector, moving that responsibility out of the
NFSD Reply encoder functions. xdr_buf->tail[0] then always begins
on an XDR-aligned boundary.
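
As a rough illustration of the arithmetic involved, the sketch below computes the pad that follows a page payload and the resulting on-the-wire length when the pad is either sent (NFS/TCP, Reply chunks) or skipped (Write chunks). It is a standalone userspace model, not kernel code: struct example_reply and the example_* helpers are hypothetical names, and only the (len + 3) & ~3 rounding is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_XDR_UNIT 4u	/* XDR data items are 4-byte aligned */

/* Hypothetical model of the three xdr_buf segment lengths */
struct example_reply {
	size_t head_len;
	size_t page_len;
	size_t tail_len;
};

/* Round @len up to a multiple of 4, same as (len + 3) & ~3 */
static size_t example_xdr_align(size_t len)
{
	return (len + (EXAMPLE_XDR_UNIT - 1)) &
		~(size_t)(EXAMPLE_XDR_UNIT - 1);
}

/* Number of zero bytes needed after a payload of @len bytes */
static size_t example_xdr_pad(size_t len)
{
	return example_xdr_align(len) - len;
}

/*
 * Bytes emitted for one Reply. The pad sits between the page
 * payload and the tail, so the tail starts XDR-aligned whenever the
 * pad is sent. @skip_pad models a transport that omits the pad.
 */
static size_t example_wire_len(const struct example_reply *r, int skip_pad)
{
	size_t pad = example_xdr_pad(r->page_len);

	assert(pad < EXAMPLE_XDR_UNIT);
	return r->head_len + r->page_len + (skip_pad ? 0 : pad) + r->tail_len;
}
```

The point of the model is that the encoder only records page_len; whether the pad bytes appear is decided in one place, by the code that knows which transport mechanism carries the payload.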

This should be straightforward for NFSv3. The only two Reply
encoders that need updating are READ and READLINK. I'm starting
with those.

Not quite sure yet how krb5i/krb5p will deal with this. Obviously
the page list pad needs to be inserted before each Reply is
wrapped.

I'll post more as experimentation progresses.


> fs/nfsd/nfs4xdr.c |   17 +++++++----------
> 1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
> index d2dc4c0e22e8..14c68a136b4e 100644
> --- a/fs/nfsd/nfs4xdr.c
> +++ b/fs/nfsd/nfs4xdr.c
> @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
> 	u32 zzz = 0;
> 	int pad;
> 
> +	/* Ensure xdr_reserve_space behaves itself */
> +	if (xdr->iov == xdr->buf->head) {
> +		xdr->iov = NULL;
> +		xdr->end = xdr->p;
> +	}
> +
> 	len = maxcount;
> 	v = 0;
> -
> -	thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p));
> -	p = xdr_reserve_space(xdr, (thislen+3)&~3);
> -	WARN_ON_ONCE(!p);
> -	resp->rqstp->rq_vec[v].iov_base = p;
> -	resp->rqstp->rq_vec[v].iov_len = thislen;
> -	v++;
> -	len -= thislen;
> -
> 	while (len) {
> 		thislen = min_t(long, len, PAGE_SIZE);
> 		p = xdr_reserve_space(xdr, (thislen+3)&~3);
> @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
> 	read->rd_length = maxcount;
> 	if (nfserr)
> 		return nfserr;
> -	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
> +	xdr_truncate_encode(xdr, starting_len + 8 + maxcount);
> 
> 	tmp = htonl(eof);
> 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);
> 

--
Chuck Lever