> On Jan 15, 2020, at 3:37 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > svcrdma expects that the READ payload falls precisely into the > xdr_buf's page vector. Adding "xdr->iov = NULL" forces > xdr_reserve_space() to always use pages from xdr->buf->pages when > calling nfsd_readv. > > Also, the XDR padding is problematic. For NFS/RDMA Write chunks, > the padding needs to be in xdr->buf->tail so that the transport can > skip over it. However for NFS/TCP and the NFS/RDMA Reply chunks, > the padding has to be retained. Not yet sure how to add this. > > Fixes: b04209806384 ("nfsd4: allow exotic read compounds") > Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053 > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > --- > Howdy Bruce- > > I'm struggling with nfsd4_encode_readv(). > > - for NFS/RDMA Write chunks, the READ payload has to be in > buf->pages. I've fixed that. > > - xdr_reserve_space() calls don't need to explicitly align the > @nbytes argument: xdr_reserve_space() already does this? > > - the while loop probably won't work if a later READ in the COMPOUND > doesn't start on a page boundary. This isn't a problem until we > run into a Solaris client in forcedirectio mode. > > - the XDR padding doesn't work for NFS/RDMA Write chunks, which are > supposed to skip padding altogether. > > Do you have suggestions? Thanks in advance. I'm experimenting with an idea I think has been mentioned on list a few times: Having the RPC layer and transports deal with the padding of the xdr_buf->pages vector, and moving that responsibility out of the NFSD Reply encoder functions. xdr_buf->tail[0] then always begins on an XDR-aligned boundary. This should be straightforward for NFSv3. The only two Reply encoders that are updated are READ and READLINK. I'm starting with that. Not quite sure yet how krb5i/krb5p will deal with this. Obviously the page list pad needs to be inserted before each Reply is wrapped. I'll post more as experimentation progresses. > fs/nfsd/nfs4xdr.c | 17 +++++++---------- > 1 file changed, 7 insertions(+), 10 deletions(-) > > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index d2dc4c0e22e8..14c68a136b4e 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -3519,17 +3519,14 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp, > u32 zzz = 0; > int pad; > > + /* Ensure xdr_reserve_space behaves itself */ > + if (xdr->iov == xdr->buf->head) { > + xdr->iov = NULL; > + xdr->end = xdr->p; > + } > + > len = maxcount; > v = 0; > - > - thislen = min_t(long, len, ((void *)xdr->end - (void *)xdr->p)); > - p = xdr_reserve_space(xdr, (thislen+3)&~3); > - WARN_ON_ONCE(!p); > - resp->rqstp->rq_vec[v].iov_base = p; > - resp->rqstp->rq_vec[v].iov_len = thislen; > - v++; > - len -= thislen; > - > while (len) { > thislen = min_t(long, len, PAGE_SIZE); > p = xdr_reserve_space(xdr, (thislen+3)&~3); > @@ -3548,7 +3545,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp, > read->rd_length = maxcount; > if (nfserr) > return nfserr; > - xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3)); > + xdr_truncate_encode(xdr, starting_len + 8 + maxcount); > > tmp = htonl(eof); > write_bytes_to_xdr_buf(xdr->buf, starting_len , &tmp, 4); > -- Chuck Lever